Grading for Mastery in Introductory Logic

I’ve been thinking for a long time about how to do assignments, exams, and grading differently in my intro logic course. Provincial budget cuts mean my enrolment will double to 200 students in the Fall term, and the fact that it will have to be fully online raises additional challenges. So maybe now is as good a time as any to rethink things!

Mastery grading, a.k.a. standards-based grading, is an approach that’s become increasingly popular in university math courses. In points-based grading, you assign points on all your assignments and exams, and then assign grades on the basis of these points. The system relies heavily on partial credit. Students will revolt if you don’t give it, because so much can hang on fractions of a percentage point. In mastery grading, you define the learning outcomes you want students to achieve, and grade based on how many of them they have achieved (and perhaps at what level they have achieved them). Big perk for the instructor: you don’t have to worry about partial credit.

In intro logic, of course, a great many problems are of the kind that we ordinarily think students must be able to do (and so carry high point values on tests) but that are terribly hard to award partial credit for. If a student doesn’t get a formal proof right, do you dock points for incorrect steps? Grade “holistically” according to how far along they are? If they are asked to show that A doesn’t entail B, is an interpretation that makes A and B both true worth 50%? In mastery grading, by contrast, it makes sense to count only correct solutions. Of course you’ll want to help students get to the point where they can solve the problems correctly: with a series of problems of increasing difficulty on problem sets before they are tested on a timed, proctored exam, for instance, and with opportunities to “try again” if they don’t get it right the first time.
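To make the counterexample point concrete with a made-up example (the sentences are mine, chosen only for illustration): to show that p ∨ q doesn’t entail p ∧ q, an answer has to give an interpretation that makes the premise true and the conclusion false. An interpretation that makes both true just isn’t a counterexample, so it’s hard to see what fraction of the credit it should earn. A minimal check in Python:

```python
# Made-up illustration: to show that (p or q) does NOT entail (p and q),
# an interpretation must make the premise true and the conclusion false.

def premise(p, q):
    return p or q

def conclusion(p, q):
    return p and q

def is_counterexample(v):
    return premise(**v) and not conclusion(**v)

good = {"p": True, "q": False}  # premise true, conclusion false: a genuine counterexample
bad = {"p": True, "q": True}    # makes both true, so it shows nothing about entailment

print(is_counterexample(good))  # True
print(is_counterexample(bad))   # False
```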

Now for an online logic course, especially one with high enrolment like mine, academic honesty is going to be a bigger issue than if I had the ability to do proctored in-class tests. Evaluation that discourages cheating will be extra important, and one of the best ways to do that is to lower the stakes on exams. If I can have many short exams instead of a midterm and a final, I’ll have to worry about cheating less. That works out nicely if I want each exam to test for a specific learning objective. More exams also means more grading, and I have limited resources to do this by hand. Luckily, most of the objectives in a formal logic course can be computer graded. I’ve already made heavy use of the Carnap system in my logic courses. One drawback used to be that Carnap could only tell whether a solution is completely correct or not; partial credit functionality has been added since COVID hit, but not having to manually go through a hundred half-done proofs every week will be crucial in the Fall. So mastery grading is a win-win on this front.

Assigning letter grades and incentivizing various behaviors (such as helping other students in online discussion boards) is, however, a lot harder than in a points-based approach. For this, I’m planning to use specifications grading: you decide at the outset what should count as performance worthy of a specific letter grade (e.g., completing all problem sets and passing 90% of quizzes and exams for an A), and then use these specifications to convert many individual all-or-nothing data points to a letter grade. To encourage a “growth mindset” (practice makes perfect), I’ll allow students to revise or repeat assignments and tests (within limits). This would be a nightmare with 200 students and 10 tests, but if they are computer graded, I just need to have two versions of each (short!) test, which is about the same effort as having makeup versions of two or three longer exams.
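Just to illustrate the kind of conversion I mean, here’s a minimal sketch in Python; the categories and thresholds are invented for the example and are not the actual specifications in my plan:

```python
# Illustrative specifications-grading conversion; thresholds are made up.

def letter_grade(problem_sets_done, total_sets, quizzes_passed, total_quizzes):
    """Convert all-or-nothing results into a letter grade via specifications."""
    quiz_rate = quizzes_passed / total_quizzes
    if problem_sets_done == total_sets and quiz_rate >= 0.9:
        return "A"
    if problem_sets_done >= total_sets - 2 and quiz_rate >= 0.8:
        return "B"
    if problem_sets_done >= total_sets - 4 and quiz_rate >= 0.7:
        return "C"
    if quiz_rate >= 0.5:
        return "D"
    return "F"

print(letter_grade(10, 10, 9, 10))  # A: all problem sets done, 9/10 quizzes passed
print(letter_grade(7, 10, 8, 10))   # C: fewer problem sets pull the grade down
```

The point is that each individual data point stays all-or-nothing; only the bundles of accomplishments get mapped to letters.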

I’ve already used specifications grading in Logic II (our metatheory course), where I just copied what Nicole Wyatt had pioneered. That, I think, has worked pretty well. The challenge is to implement it in the much larger Logic I.

I have a preliminary plan (learning outcomes, activities, grade scheme, token system). It’s a Google Doc with commenting turned on. Please let me know what you think!

If you want more info on mastery and specs grading, especially for math-y courses, check out the website for the Mastery Grading Conference that just wrapped up, especially the pre-conference assignments and the resource page. Recordings of the sessions are to come soon, I hear.

7 thoughts on “Grading for Mastery in Introductory Logic”

  1. Grading is problematic:

    Suppose Plato gets BBBFF in five courses, and Aristotle gets CCCCC in the same courses. Who did better? Grade Point Averages are supposed to answer this question. On the traditional ABCDF = 43210 system, Plato gets a GPA of 1.8 and Aristotle gets 2.0: Aristotle did better. For years, however, Claremont Graduate School used the system ABCDF = 43100. (CGS was a graduate-only institution and thought, reasonably enough, that graduate students shouldn’t be getting Cs, and certainly not Ds or Fs, so it discounted those grades accordingly.) On CGS’s scale, Plato would still get his 1.8, but Aristotle would drop to 1.0. Not good. Plato could have sued when CGS was more or less forced by other schools to change to the traditional system, thereby lowering his standing relative to Aristotle without any change in either’s performance.

    Which system is better? Is there any reason to think CCCCC is better than BBBFF, as the traditional system has it? No. There is no reason to think, using the traditional system’s numbers, that 22222 is better than 33300, and there is reason to think that it is not: if those numbers are the runs scored in the first five innings of a single baseball game, Team Aristotle is ahead 10-9. If the numbers are the runs scored in the first five games of the World Series, Team Plato is winning 3-2. Which is the better grade record thus turns on which analogy is right. There is no clear reason to think either analogy is better than the other, and hence no principled reason for preferring either grade record to the other.

    The same argument applies, mutatis mutandis, to efforts to sum up scores on examinations and papers to reach a final grade for a course, or even to sum up scores on the parts of an examination to reach a final grade for the examination.
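    (For anyone who wants to check the arithmetic, here is a quick calculation, in Python, under the two scales described above:)

    ```python
    # Reproduce the GPA comparison under the two scales described above.
    traditional = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}
    cgs = {"A": 4, "B": 3, "C": 1, "D": 0, "F": 0}  # Claremont's old scale

    def gpa(grades, scale):
        return sum(scale[g] for g in grades) / len(grades)

    plato, aristotle = "BBBFF", "CCCCC"
    print(gpa(plato, traditional), gpa(aristotle, traditional))  # 1.8 2.0
    print(gpa(plato, cgs), gpa(aristotle, cgs))                  # 1.8 1.0
    ```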

    References

    The foregoing is the Reader’s Digest version of the argument of:

    John M. Vickers. Justice and Truth in Grades and their Averages. Research in Higher Education 41 (2000), 141-164. Stable URL: http://www.jstor.org/stable/40196361.

    I created any errors in the summary all by my lonesome. See also:

    Alfred F. Mackay. Interpersonal Comparisons. The Journal of Philosophy 72 (1975), 535-549. Stable URL: http://www.jstor.org/stable/2025065.

    The locus classicus for all this is Kenneth J. Arrow. A Difficulty in the Concept of Social Welfare. Journal of Political Economy 58 (1950), 328-346. Stable URL: http://www.jstor.org/stable/1828886.

  2. I was wondering if I might ask a question about your experience using mastery grading. I am running a course with a rather basic version (homework can be resubmitted), and I am finding that students who do poorly on the homework, and so have reason to try to resubmit, find the system very stressful and feel like they are falling behind. Do you have any suggestions for how to present the idea of mastery grading to avoid this?

    1. Hm, why would they be stressed out if they can resubmit homework? How long do they have?

      1. They have until the end of the course, but are advised to resubmit it within the week after. The long deadline might be part of the problem. Students who are getting quite a few questions wrong seem to feel like they now have to do this week’s homework and the questions they missed.

        1. I don’t know how to not make them feel like they’ve fallen behind. I mean, they *have* fallen behind, after all. The difference is that *if they catch up* they can iron out their earlier bad grades. So they are incentivized to catch up. They may experience this not as an incentive but as coercion. Perhaps the happy medium is to tell them that if they scored poorly, they should review the material and then, once they have done so, complete the remaining problems for credit. Tell them that they’ll need to know it for future assignments/tests, if that’s true? I give them a week before they can request a do-over (so they don’t immediately have to catch up and can plan when to find the time), and then they have two weeks to complete the work.

          With the standards-based grading approach the difference (which may help here) is that rather than a point score there is only a bar they have to clear. So a) they know how good is good enough, and b) they stress less about getting 100%.

          1. This is helpful, thank you. I have put in a cooldown period, and that seems to have helped. Hopefully, next time I will be able to do something like standards-based grading, which I think would be a better solution.
