Scheduled to be published in the UK on August 17, 2021

An Introduction to Proof Theory provides an accessible introduction to the theory of proofs, with details worked out and many examples and exercises. It also serves as a companion to reading the original pathbreaking articles by Gerhard Gentzen. The first half covers topics in structural proof theory, including the Gödel-Gentzen translation of classical into intuitionistic logic (and arithmetic), natural deduction and the normalization theorems (for both NJ and NK), the sequent calculus, including cut-elimination and mid-sequent theorems, and various applications of these results. The second half examines Gentzen’s consistency proof for first-order Peano Arithmetic. The theory of ordinal notations and other elements of ordinal proof theory are developed from scratch. The proof methods needed, especially proof by induction, are introduced in stages throughout the text.

Paolo Mancosu has a new paper out in the Journal of Humanistic Mathematics:

This article makes available some early letters chronicling the relationship between the biologist Joseph H. Woodger and the logician Alfred Tarski. Using twenty-five unpublished letters from Tarski to Woodger preserved in the Woodger Papers at University College, London, I reconstruct their relationship for the period 1935–1950. The scientific aspects of the correspondence concern, among other things, Tarski’s reports on the work he is doing, his interests, and his — sometimes critical but always well-meaning — reactions to Woodger’s attempts at axiomatizing and formalizing biology using the system of Principia Mathematica. Perhaps the most interesting letter from a philosophical point of view is a very informative letter on nominalism dated November 21, 1948. But just as fascinating are the personal elements, the dramatic period leading to the second world war, their reaction to the war events, Tarski’s anguish for his family stranded back in Poland, the financial worries, and his first reports on life in the East Coast and, as of 1942, at the University of California, Berkeley.

There is much that is interesting in this correspondence, but what struck me most was letter 8, dated May 22, 1939. This is exactly the time when Tarski was considering offers by Quine and others to come to America. Tarski was hesitant, something we now, in hindsight, find puzzling. The Fefermans, in their biography of Tarski, also found this puzzling. Their explanation was that Tarski’s “surpreme absence of self-doubt” was the determining factor. Leśniewski had just died and, the Fefermans conjectured, Tarski was certain that as “clearly the pre-eminent logician in Poland”, he would be appointed as Leśniewski’s successor in Warsaw (p. 106–107).

Letter 8 paints a completely different picture, which any current precariously employed scientist can appreciate. Quine suggested that Tarski should travel to the US in the Fall of 1939 and stay for a year. There was no offer of employment, no grant money, nothing, just vague assurances that invitations and perhaps honoraria would be forthcoming. In Poland, Tarski made ends meet as a high-school teacher and adjunct lecturer at Warsaw University. This is Tarski’s evaluation of his situation:

It seems to me that it would be criminally reckless if I decided to follow Quine’s “suggestions” and your advice. You don’t take into account a circumstance. I have two parents, a wife and two children. And these are all people, and no marmots, no bears, etc.: they are not able to sink into sleep for a few months. If I decide to go to America and spend the next school year there (regardless of whether I get a job there), I have to inform the school that I am giving up my hours next year; as a result, a third of my income will be taken away as early as 1 August (i.e. the money I receive at school). I have to do the same at the university—I have to inform them that I do not intend to give lectures in the coming year; as a result, on 1 September, the second third of my income will be eliminated. The last third (i.e. the money I get in university for my adjunct position) will be eliminated on October 1st if I don’t take up my duty at that time. My savings are enough for the passport, the visa, the trip to America and back and still for a one-month stay in America. So what will my family live on in September? My wife earns about 6.50£ a month, but that’s too little for one person, let alone for five.

He goes on to contrast Quine’s offer to him with the much more generous and definite offers of funding to Carnap and Woodger himself. Later letters detail his repeated attempts at rescuing his family from Nazi-occupied Poland. Read the whole thing here.

A couple of days ago, Daniel Litt linked to Patrick Brosnan‘s computer-verified “proof” of the inconsistency of Peano arithmetic. The proof is correct; I just put it in quotes because it relies on a quirk of the proof verification system used (Metamath), which requires you to explicitly prohibit certain variable substitutions. The axiom of PA used in the proof of \(0=1\) is \(x < y \leftrightarrow \exists z\, y = x + z’]\). Nothing wrong with that, but you can’t allow \(y\) to be substituted by \(z\) (or \(z\) to be renamed to \(y\)). The author of the original Metamath file was Robert Solovay, so this reminded me that quite a number of famous logicians proposed axiom systems that turned out to be inconsistent (although most of them didn’t just forget to prohibit an obviously illegal substitution). Here’s a partial list; let me know of additions.

First, of course there’s Frege’s basic law V, which was shown inconsistent by Russell (Russell’s paradox). The inconsistency gave us Russell’s theory of types and Whitehead & Russell’s Principia mathematica, and, much later, the neo-logicist systems of second-order arithmetic that replace Basic Law V with Hume’s principle.

An early version of the logic underlying Principia, the so-called substitutional theory, turned out to be inconsistent. Russell found the inconsistency himself; see this paper by Bernie Linsky (thanks Landon Elkind for pointing me to this).

Hilbert and Ackermann also had notorious problems with the correct formulation of their substitution rules, although their errors in successive versions of Grundzüge der theoretischen Logik made the system only unsound and not outright inconsistent (as far as I know)—see Pager’s 1960 paper and references therein.

Quine’s system of New Foundations was originally introduced in his Mathematical Logic in 1940. The system as presented in the first (1940) edition allows the derivation of the Burali-Forti paradox. This was again proved by Rosser, who thus leads the scoreboard in number of systems shown inconsistent.

Rosser actually went on to use Quine’s NF in his own textbook Logic for Mathematicians, published in 1953. In his development of mathematics in NF, he discussed and used the axiom of choice. Unfortunately, that same year Specker showed that NF also disproves the axiom of choice (thanks to Martin Davis for pointing this out). In his review of Specker’s paper, Rosser suggested that a restriction of AC is all that’s needed for the results developed in his book. There, he only proposed countable choice as an axiom (Axiom 15, p. 512) which (as far as is known today) is consistent (thanks to Randall Holmes).

A version of the theory of constructions developed in the 1960s by Georg Kreisel and Nicolas Goodman was inconsistent as shown in Goodman’s 1968 dissertation (see this paper by Dean & Kurokawa).

Per Martin-Löf’s original (1971) system of constructive type theory was also inconsistent—shown by Jean-Yves Girard in his 1972 dissertation (Girard’s paradox), and this led to important developments in type theory.

Another famous example from set theory is the inconsistency of Reinhardt cardinals, proved by Kunen. This has perhaps a different flavor than the other examples: presumably (I don’t have access to Reinhardt’s thesis) he proposed the axiom (that there is a non-trivial elementary embedding of the universe into itself) originally as an object of investigation, rather than as something taken to be obviously true (as Frege took Basic Law V). (Thanks Toby Meadows and David Schrittesser for reminding me of this nevertheless. Factoid: Reinhardt is my academic uncle.)

Inconsistency is a problem in classical logic because from an inconsistency you can prove anything: thus classically, inconsistent theories are trivial. One way to get around this is to change the logic in such a way that inconsistencies by themselves don’t make the theory “explode”. Newton da Costa pioneered such logics and in the 1960s he developed paraconsistent set theories on their basis. The hope was that they would prove contradictions such as Russell’s paradox without thereby proving absolutely everything. Unfortunately, some of those nevertheless turned out to be trivial (shown in 1985 by Arruda; h/t Luis Estrada-González).

Panu Raatikainen reminded me that the list is sometimes referred to as Martin Davis’ honor roll (after this FOM post).

MUltlog is a Prolog program that converts a specification of a finite-valued logic (propositional or first-order) into optimal inference rules for a number of related analytic proof systems: many-sided sequent calculus, signed tableaux, many-sided natural deduction, and clause translation calculi for signed resolution. The specification of the logic can be produced by a simple TCL/TK application, iLC (due to Andreas Leitgeb).

MUltlog dates back to 1996 and was written by Gernot Salzer. The code still works, and it turned out that it’s not hard to install it on Linux — all you need is TeXLive and SWI Prolog installed, which is provided packaged in standard distributions like Ubuntu or Debian (as is TCL/TK). I’ve taken Gernot’s code, put it on Github, and gussied it up a little bit. The output of Multlog is a paper that describes the calculi for the system that you input. For instance, below you’ll find the output if you feed it paraconsistent weak Kleene logic.

UPDATE: As of v1.14, the distribution and the new website includes a number of examples, including such favorites as FDE, LP, and classical logic with some fun connectives & quantifiers.

Well, my intro to formal logic (Logic I) course is in the can. I think it was a success! I could not have done it without Graham Leach-Krouse’s Carnap system, and my excellent team (Husna Farooqui, Sarah Hatcher, Hannah O’Riain, and Dvij Raval).

Split up the material into weekly units, e.g., “Translations in TFL” or “Interpretations for nested quantifiers and identity”.

Recorded introductory lectures on each unit (typically under an hour’s worth, separated into 10-15 minute chunks). Nothing fancy, just me talking through some beamer slides.

Of the three scheduled hours for lecture, we used two for synchronous Zoom sessions where the students and I went through exercises and I answered questions.

Tutorial sections were split between more practice problems as a group with the TAs, and small groups (breakout rooms) of about 5 students working on their weekly problem sets (with the option of calling for help from the TA).

Assessment was on the basis of three weekly activities:

A problem set (collaboration allowed, all done on Carnap)

A quiz (multiple choice on the LMS, testing some concepts and applications that Carnap isn’t set up to do).

A “challenge problem,” also done on Carnap: a timed exercise where collaboration was not allowed.

Quizzes and challenge problems replaced exams: in fact, I used the same kinds of questions I would have asked on an exam. The idea is that they work on the problem set during the week, which prepares them for the challenge problem, then they basically take an exam consisting of 10 multiple choice questions plus one typical, not-too-hard exercise.

Problem sets and quizzes had about 8 questions I would expect a B student to be able to do, and 2 harder questions which they have to complete if they want an A. They were graded incomplete/complete/complete+. Challenge problems are just incomplete/complete. To get a B, you have to complete 10 of the 12 units. For an A you have to complete all of them, and on some number of problem sets and quizzes you need complete+. (I had set that number at 6/12, and I got a lot of As. So next time it’ll be 10/12 for an A.)

This system worked extremely well to keep the students on track. Every week had the same structure, they didn’t really have to keep track of any deadlines: every Monday they had three things due and basically 5 days to do it.

In keeping with the idea that what counts is achieving proficiency (“mastering”) the material, but not how long it takes them to do it, I allowed them to re-do things that they couldn’t get done on the first try. Every student had 6 “tokens” that would buy them a do-over of one of the activities. Every week they got to tell us what they wanted to re-do, then we gave them another shot, and replaced their mark when they scored better.

If you want to see the outline for next term, which includes the list of topics, and more detail on all of the above (and including some tweaks to the grade scheme in response to this term’s experience) see Logic I (Phil 279).

We investigate a recent proposal for modal hypersequent calculi. The interpretation of relational hypersequents incorporates an accessibility relation along the hypersequent. These systems give the same interpretation of hypersequents as Lellman’s linear nested sequents, but were developed independently by Restall for S5 and extended to other normal modal logics by Parisi. The resulting systems obey Došen’s principle: the modal rules are the same across different modal logics. Different modal systems only differ in the presence or absence of external structural rules. With the exception of S5, the systems are modular in the sense that different structural rules capture different properties of the accessibility relation. We provide the first direct semantical cut-free completeness proofs for K, T, and D, and show how this method fails in the case of B and S4.

Any set of truth-functional connectives has sequent calculus rules that can be generated systematically from the truth tables of the connectives. Such a sequent calculus gives rise to a multi-conclusion natural deduction system and to a version of Parigot’s free deduction. The elimination rules are “general,” but can be systematically simplified. Cut-elimination and normalization hold. Restriction to a single formula in the succedent yields intuitionistic versions of these systems. The rules also yield generalized lambda calculi providing proof terms for natural deduction proofs as in the Curry-Howard isomorphism. Addition of an indirect proof rule yields classical single-conclusion versions of these systems. Gentzen’s standard systems arise as special cases.

I’ve been thinking for a long time about how to do assignments, exams, and grading differently in my intro logic course. Provincial budget cuts mean my enrolment will double to 200 students in the Fall term, and the fact that it will have to be fully online raises additional challenges. So maybe now is a good time as any to rethink things!

Mastery Grading aka Standards-based Grading is an approach that’s become increasingly popular in university math courses. In points-based grading, you assign points on all your assignments and exams, and then assign grades on the basis of these points. The system relies heavily on partial credit. Students will revolt if you don’t give it, because so much can hang on fractions of a percentage point. In mastery grading, you define the learning outcomes you want students to achieve, and grade based on how many many they have achieved (and perhaps at what level they have achieved them). Big perk for the instructor: you don’t have to worry about partial credit.

In intro logic of course a great many problems are of the kind that we ordinarily think students must be able to do (so get high point values on tests) but are terribly hard to award partial credit for. If a student doesn’t get a formal proof right, do you dock points for incorrect steps? Grade “holistically” according to how far along they are? If they are asked to show that A doesn’t entail B, is an interpretation that makes A and B both true worth 50%? In mastery grading, instead it makes sense to only count correct solutions. Of course you’ll want to help students get to being able to correctly solve the problems, with a series of problems of increasing difficulty on problem sets before they are tested on a timed, proctored exam, for instance, and with opportunities to “try again” if they don’t get it right the first time.

Now for an online logic course, especially one with high enrollment like mine, academic honesty is going to be a bigger issue than if I had the ability to do proctored in-class tests. Evaluation that discourages cheating will be extra important, and one of the best ways to do that is to lower the stakes on exams. If I can have many short exams instead of a midterm and a final, I’ll have to worry about cheating less. That works out nicely if I want each exam to test for a specific learning objective. More exams also means more grading, and I have limited resources to do this by hand. Luckily most of the objectives in a formal logic course can be computer graded. I’ve already made heavy use of the Carnap system in my logic courses. One drawback is that Carnap can only tell if a solution is completely correct or not. Although partial credit functionality has been added since COVID hit, not having to manually go through a hundred half-done proofs every week will be crucial in the Fall. So mastery grading is a win-win on this front.

Assigning letter grades and incentivizing various behaviors (such as helping other students in online discussion boards) is, however, a lot harder than in a points-based approach. For this, I’m planning to use specification grading: You decide at the outset what should count as performance worthy of a specific letter grade (e.g., completing all problem all problem sets, passing 90% of quizzes an exams for an A) and then use these specifications to convert many individual all-or-nothing data points to a letter grade. To encourage a “growth mindset” (practice makes perfect) I’ll allow students to revise or repeat assignments and tests (within limits). This would be a nightmare with 200 students and 10 tests, but if they are computer graded, I just need to have two versions of each (short!) test — about the same effort as having makeup versions of two or three longer exams.

I’ve already used specifications grading in Logic II (our metatheory course), where I just copied what Nicole Wyatt had pioneered. That, I think, has worked pretty well. The challenge is to implement it in the much larger Logic I.

I have a preliminary plan (learning outcomes, activities, grade scheme, token system). That’s a google doc with commenting turned on. Please let me know what you think!

If you want more info on mastery & specs grading especially for math-y courses, check out the website for the Mastery Grading Conference just completed, especially the pre-conference assignments and the resource page. Recordings of sessions to come soon, I hear.

When you define satisfaction for quantified formulas, e.g., \(\forall x\, A\), you have to have a way to make \(x\) range over all elements of the domain. Here are the common options:

A. Tarski-style: use variable assignments \(s\colon V \to D\) (where \(V\) is the set of variables and \(D\) the domain), then define \[\mathfrak{M}, s \models \forall x \, A \iff \mathfrak{M}, s’ \models A \text{ for all $x$-variants $s’$ of $s$}.\] This requires a definition of “\(x\)-variant” but is otherwise very clean. Its drawback is that it obscures how we let \(x\) range over all elements of \(D\), and my students have a hard time understanding it and an even harder time working with it.

B. Alternative Tarski-style: we use variable assignments as above, but avoid talking about \(x\)-variants. Instead, we define the notation \(s[m/x]\), the variable assignment just like \(s\) except it assigns \(m \in D\) to \(x\). Then we have \[\mathfrak{M}, s \models \forall x \, A \iff \mathfrak{M}, s[m/x] \models A \text{ for all } m \in D.\]

C. Model theory style: instead of introducing variable assignments that provide the interpretation for variables, we define directly when a formula is satisfied by a sequence of objects: if the variables of \(A\) are among \(y_1, \dots, y_k\) then \(\mathfrak{M} \models A[n_1,\dots, n_k]\) means what \(\mathfrak{M}, s \models A\) means Tarski-style for the assignment \(s\) that maps each \(y_i\) to \(n_i\). Then the clause for the universal quantifier becomes \[\mathfrak{M} \models \forall x \, A[n_1, \dots, n_k] \iff \mathfrak{M} \models A[m, n_1, \dots, n_k] \text{ for all } m \in D.\] This is simple in that it avoids an intermediary function, but can easily be confusing for beginning students because it is neither clean nor precise. We have to understand that \(A\) is a formula with the free variables \(x, y_1\, \dots, y_k\). But what determines the order? Or, put another way, which object interprets which variable?

D. In logic textbooks for philosophers you often see semantics developed for sentences only (i.e., formulas with free variables are avoided). Given a structure \(\mathfrak{M}\) we can define \(\mathfrak{M}[m/a]\) as the structure that’s just like \(\mathfrak{M}\) except the constant \(a\) is interpreted by \(m\in D\). Then we can define truth (not satisfaction) using \[\mathfrak{M} \models \forall x \, A \iff \mathfrak{M}[m/a] \models A[a/x] \text{ for all } m \in D,\] where \(A[a/x]\) is the substitution operation and \(a\) is a constant not already occurring in \(A\).

E. Finally, there’s Robinson-style: we treat every \(m\in D\) as a constant symbol that names itself. Then substituting \(m\) for \(x\) in \(A\) is possible, since \(m\) belongs to both the domain of the structure and to the language, and we can write \[\mathfrak{M} \models \forall x \, A \iff \mathfrak{M} \models A[m/x] \text{ for all } m \in D.\] Naturally, this is not something philosophers like to do because it just seems confused to allow domain elements to function as linguistic symbols naming themselves.

Maybe I’ll find the time to write a paper tracing the origins of all of these at some point. But for now, I wonder: which way is best, pedagogically? Specifically, the Open Logic Project uses Tarski-style, but I’d like to replace it with a definition that is easier to understand and avoids the use of \(x\)-variants. Which would you prefer for the OLP? Which do you prefer in your own teaching or research and why?

So, we’re all moving to online courses, and for some of us that means we have to figure out how to switch from scribbling feedback and letter grades on papers, handing them back to students, and turning those letter grades into a course grade at the end. Most of us are using learning management systems (LMS, such as Brightspace/D2L, Blackboard, Canvas, Moodle) to do this already, and now it’s just a matter of keeping track of papers entirely in the LMS, rather than just entering grades. You probably already have a system for converting between letter grades and percentages, which the LMS use to calculate overall grades based on weights of various course components. If so, it’s probably easiest to stick to that. This post is to record how to do two things (mainly in Brightspace/D2L since that’s what we use): a) get the LMS to offer letter grades (or other scales) as grade choices when grading a paper, and b) to set up a grade scheme that keeps track of (averages and weights) of letter grades without a percentage scheme.

For a), you will need a Grade Scheme (see “Schemes” tab on the “Grades” screen) that contains all the letter grades you want to assign. For each grade scale, you need a separate scheme. You might have one scheme for all letter grades (including + and – grades), or one for letter grades and slash grades (A, A/B, B, etc.), or even one with just ✓, ✓+, and ✓-. Once you have such a scheme, you can make the grade item corresponding to the assignment you want to receive, assess, and return in the D2L dropbox of type “Selectbox”. For a selectbox grade item, you have to pick a grade scheme. When you then associate this grade item with a dropbox folder, the assessment pane will give you a drop-down menu rather than a box to enter a numerical score in:

To set up the grade scheme in the first place, and to solve problem b), we have to do some math. D2L does grade schemes in percentages, but we think more naturally of grade point values: an A+ is 4.3, and A is 4, and A- is 3.7, etc. To convert these into percentages, just divide by the maximum score (i.e., 4.3): A+ is 100%, A is 4/4.3 = 93.02%, A- is 3.7/4.3 = 86.04% etc. Do the same for any other grade scheme you want to use. E.g., if you want slash grades, you’d assign 3.5 to A/B, and 3.5/4.3 = 81.50%.

Note that whatever the maximum score is in your grade scheme here should also be the “maximum score” in any individual grade item that uses this score.

The grade scheme in D2L asks for not just an “assigned value %” for each letter grade (or other scale item), but also for a “start %”. If you average the grades (or rather, their “assigned %” values), you may get a value that does not correspond precisely to a letter grade. You have to decide where to start to “round up”. Say, you have three papers. If you want a student to get an A overall only if they get three As, then the “start %” should be the same as the “assigned value %” for an A (93.02%). But maybe you want to give them an A if they turn in two As and an A- (or some other threshold, e.g., 3 As and 2 A- on 5 papers, etc.). If so, compute the average of the grade point values of the threshold pattern, e.g., if A/A/A- earns an A, (4.0+4.0+3.7)/3 = 3.9. Then convert that to a percentage: 3.9/4.3 = 90.70. You probably want to be careful with the start % value of a D: If you have three papers and want to pass a student with a D if they have turned in 2/3 papers with Ds, the start % is (1+1+0)/3/4.3, i.e., 15.50%. But if you have many assignments, a low percentage like that will make it possible to earn a D with an A on a single assignment (if you have 6 papers, and a student gets an A on the first and then never submits another paper, they will earn 15.5% overall. But probably you don’t want to pass that student). For this reason, I like to make the start value of a D the percentage equivalent of 0.9/4.3, or 20.93).

If you’re using some other LMS, you will have to figure out how to do all this there. E.g., Moodle has grading scales (corresponding to D2L’s “assigned %” scale) and also a course-wide system of converting percentages to letter grades, which corresponds to D2L’s “start %” scale.

You can use this Letter Grade Scheme together with other schemes. For instance, you may use it only as the grade scheme for the final grade, but assess other assignments on a more coarse-grained basis, such as slash grades. Some items may even just get a pass/fail or turned-in/not-turned-in assessment. A popular scale for such assignments is ✓, ✓+, and ✓-. You’ll need to decide to assign letter grade or grade point equivalents to such grades for the purpose of calculation (perhaps A, A+, B+) and use these to compute the “assigned %” for your ✓+/- grade scheme. (Unless you want to display a category average using these alternative grade schemes, you can set the “start %” equal to the “assigned %”.) For a scale like ✓+/- remember to add a grade value corresponding to “not turned in” (with assigned value 0%), or else you won’t be able to distinguish between an assignment that’s received a ✓- and one that’s missing.

A final tip: if you’re using grade schemes, having D2L show you the grade symbol and also the percentages will clutter your grade sheet view. So when you create your selectbox grade items, check “override display options” and only leave “scheme symbol” (and maybe “scheme color”) checked. This will also keep students from being confused.

problem sets, which are mostly completed online on Carnap and graded automatically (see here if you want to use Carnap with a different textbook)

practice problems for Carnap (accessible on carnap.io as well)

3 tests (only one converted to online/carnap so far)

Here are the beamer slides. If you’re an instructor and want the sources, drop me an email at rzach@ucalgary.ca. (Of course you’ll also get the sources to the problem sets etc.)

Well, all my logic lectures moved online as of last week. It’s been a bit of a scramble, as I’m sure it’s been for you as well. I needed to rapidly produce videos of lectures (on logic in my case) I can give with students to watch. I thought I’d quickly share what I’m doing in case you’re in a similar situation.

My laptop runs Linux (Ubuntu 19.10 to be specific). So there are few options. If you’re on a Mac or Windows machine, there’s lots and you probably don’t need any help. Maybe your University even has a preferred solution for screencasting that integrates with your LMS.

For screencast recording on Linux I find Kazam works fine. It’s super-simple, all it does is record the microphone (or computer speaker output) together with whatever happens on your screen (or in a window). So if you want to show your students how to work Overleaf or Carnap or whatever, or if you want to show them a beamer presentation and talk over it, that’s all you need. (Well, you might want to invest in a decent microphone.)

But what if your lecture is chalk-and-talk? You need a way to let yourself “write on the board” while you talk through your proof or whatever. For that you need a handwriting/sketching app and a way to write comfortably (touchscreen/tablet and stylus). I did get a stylus and an Android tablet and tried out a few handwriting apps, but I couldn’t get the palm rejection to work on any of them. (If you rest your palm on the screen, the tablet won’t recognize what your stylus is doing, so you need an app or a stylus that can isolate the stylus from your hand. I’m told iPads are better for that and/or there are active styluses that have palm rejection built in. Not going to buy an iPad just to try that out though.)

I also have a Wacom Intuos writing tablet I got last week in panicked anticipation ($70 US/CAD 90). It works with Linux (plug-and-play USB), just takes a little getting used to. For a handwriting app, I discovered StylusLabs Write. It works really nice. I just fire it up, hit record on Kazam, start writing. It can easily add a new page/whiteboard area, you can scroll back to a previous one easily, and in the end you can save the whiteboard as a PDF. Here’s an example of me talking through the truth lemma in the completeness proof. {Update: See comment below for a vote for xournal++.} {Update 2: OpenBoard now runs on Ubuntu 20.04 — full-feature whiteboard with PDF import functionality and built-in screencast support!}

I would prefer to use my ReMarkable for all of this: it has a desktop app for Mac & Windows that shows what you’re drawing on it. So if you have one, try that out! I was hoping to make it work in Linux using srvfb, but have to wait until ReMarkable fixes a bug that turned off ssh access to the tablet. Will let you know what I find out.

{UPDATE (02/2021): I now use tableaunoir ! It is online, supports sharing (your students can also draw on the blackboard) and also LaTeX code. It has chalkboard sound effects. Special feature: You can also turn parts of the board into “magnets” which you can then move around. As part of a diagram, edges between magnets move with them if you move the magnets.}

A couple of years ago I posted a roundup of interactive logic courseware with an automatic grading component. The favorite commercial solution is Barwise & Etchemendy’s Language, Proof, and Logic textbook that comes with software for doing truth tables, natural deduction proofs, and semantics for propositional and first-order logic, which also automatically grades student’s solutions. The problem is, that (a) it costs money and (b) will only grade problems from that textbook. I already pointed to the open-source, and free-to-use alternative Carnap by Graham Leach-Krouse and Jake Ehrlich. Graham wrote a guest post on Daily Nous about it a while ago. I’ve now used this myself with great success and thought I’d write up my experience.

Carnap is an online tool that allows you to do the following. You can upload webpages (written in a variant of Markdown) which may include logic problems of various sorts. These are, right now:

Translation exercises (i.e., you provide a sentence in English and the student’s task is to symbolize it in propositional or first-order logic).

Truth tables (you give sentence(s) of propositional logic, the student must fill in a truth table, and use it to determine, say, if an argument is valid, a sentence is a tautology, or if two sentences are equivalent, etc.).

Derivations (you provide a sentence or argument and the student has to construct a formal proof for it).

Model construction (you provide a sentence or argument, the student has to give a domain and extensions of predicates to make the sentence true, false, or show that the argument is invalid, etc.).

Basic multiple choice and short-answer questions.

Carnap comes with its own textbook and a collection of pre-made problem sets. But you can make up your own problem sets. That’s of course a bit of work, but you have complete control over the problems you want to assign. Here are some sample exercises that go with the Calgary version of forall x:

These are pages I give to my students to get them to become familiar with Carnap before they have to actually do problems for credit. The main difference is that for a real problem set, each exercise has a “submit” button that the student can click once they’ve found a correct solution.

To get an idea of how these problem sets are written, have a look at the documentation.

As you see, the problems are interactive: the student enters the solution, and Carnap will tell them if the solution is correct. In the case of derivations, it will also provide some feedback, e.g., tell the student why a particular application of a rule is incorrect.

You can assign a point value to each problem. Carnap also allows you to set up a course, let students sign up for the course, and lets you assign the pages you’ve created as problem sets. It will allow students to submit problems they have correctly solved, and Carnap will tally the point score earned. You can then download a spreadsheet of student scores per problem set and assign marks on the basis of that.

As you see, Carnap is incredibly flexible. Moreover, it supports the syntax and proof rules of a number of popular textbooks. I’ll highlight the free/open ones:

Commercial texts supported by Carnap, which you would be evil to make your students buy of course, are:

Bergman, Moore, Nelson, The Logic Book (McGraw-Hill, $130)

Goldfarb, Deductive Logic (Hackett, $39)

Hausman, Kahane, Tidman, Logic and Philosophy (Wadsworth, $120)

Howard-Snyder, Howard-Snyder, Wasserman, The Power of Logic (McGraw-Hill, $130)

Kalish and Montague, Logic: Techniques of Formal Reasoning (Oxford, $90)

Tomassi, Logic (Routledge, $54)

All of these textbooks use a linear version of natural deduction (Fitch, Lemmon, or Montague), but Carnap now also has proof editors for Gentzen-style sequent calculus and natural deduction proofs and checks them for correctness.

How does it support different textbooks? Basically, the document you upload just tells Carnap, say, what sentence you want the student to produce as a translation, or what sentence you want them to give a proof of. You can change the “system” in which they do that, and based on that Carnap will show them the symbols differently (e.g., will ask them to do a truth table for \((\lnot P \land Q) \to R\) or for \((\mathord\sim P \mathbin\& Q) \supset R\)), and and will accept and display proofs in different formats and allow different rules. Even if your favorite text doesn’t show up above it’s likely that it is already partially supported. Graham is also incredibly helpful and responsive; last term he introduced new proof system systems and other features based on my request, often within days. (Bug reports and feature requests are handled via GitHub.)

Carnap is already pretty smart. It will accept not only solutions to translation questions that are syntactically identical to the model solution, but any equivalent solution (the equivalence check is not perfect for the first-order case, but will generally accept any reasonable equivalent translation). Graham has recently introduced a few new features, e.g., you can randomize problems for different students, or require that some conditions are met for translation problems (e.g., that the translation only uses certain connectives or is in a normal form).

To get set up, just email Graham. Once you have an instructor account and are logged in, you’ll be able to see the actual problem sets I assign in my class. You’re welcome to copy and use them of course! (If you happen to use a different textbook, you’ll just have to adjust the terminology and change the “system” Carnap is supposed to use in each problem.) Check here for more of the course, like lecture slides.

In Gabriele M. Mras, Paul Weingartner & Bernhard Ritter (eds.), Philosophy of Logic and Mathematics. Proceedings of the 41st International Ludwig Wittgenstein Symposium. Berlin: De Gruyter. pp. 313-326 (2019)

The Curry-Howard isomorphism is a proof-theoretic result that establishes a connection between derivations in natural deduction and terms in typed lambda calculus. It is an important proof-theoretic result, but also underlies the development of type systems for programming languages. This fact suggests a potential importance of the result for a philosophy of code.

A textbook on modal and other intensional logics based on the Open Logic Project. It covers normal modal logics, relational semantics, axiomatic and tableaux proof systems, intuitionistic logic, and counterfactual conditionals.

Textbook on Gödel’s incompleteness theorems and computability theory, developed for Calgary’s Logic III course, based on the Open Logic Project. Covers recursive function theory, arithmetization of syntax, the first and second incompleteness theorem, models of arithmetic, second-order logic, and the lambda calculus.