Logic Colloquium, Udine

The European Summer Meeting of the Association for Symbolic Logic will be in Udine, just north of Venice, July 23–28. Abstracts for contributed talks are due on April 27. Student members of the ASL are eligible for travel grants!

lc18.uniud.it

 

The Significance of Philosophy to Mathematics

If you wanted to explain how philosophy has been important to mathematics, and why it can and should continue to be, it would be hard to do better than Jeremy Avigad does. In this beautiful plea for a mathematically relevant philosophy of mathematics, disguised as a book review, he writes:

Throughout the centuries, there has been considerable interaction between philosophy and mathematics, with no sharp line dividing the two. René Descartes encouraged a fundamental mathematization of the sciences and laid the philosophical groundwork to support it, thereby launching modern science and modern philosophy in one fell swoop. In his time, Leibniz was best known for metaphysical views that he derived from his unpublished work in logic. Seventeenth-century scientists were known as natural philosophers; Newton’s theory of gravitation, positing action at a distance, upended Boyle’s mechanical philosophy; and early modern philosophy, and philosophy ever since, has had to deal with the problem of how, and to what extent, mathematical models can explain physical phenomena. Statistics emerged as a response to skeptical concerns raised by the philosopher David Hume as to how we draw reliable conclusions from regularities that we observe. Laplace’s Essai philosophique sur les probabilités, a philosophical exploration of the nature of probability, served as an introduction to his monumental mathematical work, Théorie analytique des probabilités.

 

In these examples, the influence runs in both directions, with mathematical and scientific advances informing philosophical work, and the converse. Riemann’s revolutionary Habilitation lecture of 1854, Über die Hypothesen welche der Geometrie zu Grunde liegen (“On the hypotheses that lie at the foundations of geometry”), was influenced by his reading of the neo-Kantian philosopher Herbart. Gottlob Frege, the founder of analytic philosophy, was a professor of mathematics in Jena who wrote his doctoral dissertation on the representation of ideal elements in projective geometry. Late nineteenth-century mathematical developments, which came to a head in the early twentieth-century crisis of foundations, provoked strong reactions from all the leading figures in mathematics: Dedekind, Kronecker, Cantor, Hilbert, Poincaré, Hadamard, Borel, Lebesgue, Brouwer, Weyl, and von Neumann all weighed in on the sweeping changes that were taking place, drawing on fundamentally philosophical positions to support their views. Bertrand Russell and G. H. Hardy exchanged letters on logic, set theory, and the foundations of mathematics. F. P. Ramsey’s contributions to combinatorics, probability, and economics played a part in his philosophical theories of knowledge, rationality, and the foundations of mathematics. Alan Turing was an active participant in Wittgenstein’s 1939 lectures on the foundations of mathematics and brought his theory of computability to bear on problems in the philosophy of mind and the foundations of mathematics.

Go and read the whole thing, please. And feel free to suggest other examples!

The book reviewed is Proof and Other Dilemmas: Mathematics and Philosophy, Bonnie Gold and Roger A. Simons, eds., Mathematical Association of America, 2008.

[Photo: Bertrand Russell and G. H. Hardy as portrayed by Jeremy Northam and Jeremy Irons in The Man Who Knew Infinity, via MovieStillsDB]

Ptolemaic Astronomy

Working on the chapters on counterfactual conditionals for the Open Logic Project, I needed some illustrations for David Lewis’s sphere models, which he jokingly called “Ptolemaic astronomy.” Since Franz Berto joked that this should just require \usepackage{ptolemaicastronomy}, I wrote some LaTeX macros to make this easier using TikZ. You can download ptolemaicastronomy.sty (it should work independently of OLP); examples are in the OLP chapter on minimal change semantics (PDF, source).
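To give the idea, here is a minimal plain-TikZ sketch of the kind of diagram involved: nested spheres around the world of evaluation. This is just an illustration; it doesn’t use the package’s own macros:

\documentclass{standalone}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
  % nested spheres: worlds in smaller spheres count as more similar to w
  \foreach \r in {0.5,1,1.5,2}
    \draw (0,0) circle [radius=\r];
  % the world of evaluation at the center
  \fill (0,0) circle [radius=1.5pt] node[below right] {$w$};
  % some other world, here in the second sphere
  \fill (30:1.25) circle [radius=1.5pt] node[above right] {$w_1$};
\end{tikzpicture}
\end{document}

The package wraps patterns like this in higher-level commands so you don’t have to position the circles and worlds by hand.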

(This will probably interest a total of two people other than me, so I didn’t spend much time documenting it, but if you want to use it and need help, just comment here.)

Update: it’s now in its own GitHub repository and properly documented.

A New University of Calgary LaTeX Thesis Class based on Memoir

The University of Calgary provides a LaTeX thesis class on its website. That class is based on the original thesis class, modified over the years to keep up with changes to the thesis guidelines of the Faculty of Graduate Studies. It produces atrocious results: chapter headings are not aligned properly, the margins are set to 1 inch on all sides (which results in unreadably long lines of text), and the template sets the typeface to Times New Roman. Urgh. A better class (by Mark Girard) is available, but it, too, sets the margins to 1 inch. FGS no longer requires that the margins be exactly 1 inch, only that they be at least 1 inch, so we are no longer forced to produce that atrocious page layout.

I made a new thesis class. It’s based on memoir, which provides some nice functionality for computing an attractive page layout. By default, the class sets the thesis one-and-a-half spaced, in 11 point type, with about 65 characters per line. This produces a page approximating a nicely laid out book page. The manuscript class option sets it up for 12 point type, double spaced, with 72 characters per line and 25 lines per page. That’s still readable, but gives you extra space between the lines for annotations and editing marks, and wider margins. There are also class options to load some decent typefaces (palatino, utopia, garamond, libertine, and, ok, times).
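For the curious, the core of the layout computation looks roughly like this. This is a sketch of the approach, not the actual class code: memoir’s \setlxvchars measures the width of an average 65-character line in the body font, and the type block and margins are derived from that:

\documentclass[11pt,letterpaper]{memoir}
\setlxvchars[\normalfont]              % width of an average 65-character line
\settypeblocksize{*}{\lxvchars}{1.618} % type block: 65 chars wide, height from the ratio
\setlrmargins{*}{*}{1}                 % split the remaining width evenly
\setulmargins{*}{*}{1.3}               % bottom margin a bit larger than the top
\checkandfixthelayout
\OnehalfSpacing                        % 1.5 line spacing
\begin{document}
The line length stays readable regardless of paper or type size.
\end{document}

Because the margins are computed from the type block rather than fixed at 1 inch, the line length stays near the 65-character target at any type size.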

Once upon a time, theses were typed on a typewriter and submitted to the examination committee in hardcopy. Typewriter fonts are “monospaced,” i.e., every character takes the same amount of space. “Elite” typewriters printed 12 characters per inch, or 72 characters per 6 inch line; “Pica” typewriters printed 10 cpi, or 60 characters per line. Typewriters fit 6 lines into a vertical inch; double spaced, that’s 3 lines per inch, or about 25 lines per page. A word is on average 5 characters long (6 with the following space), hence 25 lines of 60 characters come to about 250 words per manuscript page.

No one uses typewriters to write theses anymore, but thesis style guidelines are still a holdover from the time when we did. The guidelines still require that theses be one-and-a-half or double spaced, but of course they allow the use of word processing software. Word processors don’t use monospaced typewriter fonts, and the recommended typefaces such as Times Roman are proportionally spaced and much narrower. That means that even with 12 point type, a 6″ line now contains on average 89 characters rather than 60. (Chris Pearson has estimated “character constants” for various typefaces, which you can use to estimate the average number of characters per inch at various type sizes. For Times New Roman, the factor is 2.48. At a line length of 6″, i.e., 432 pt, and 12 pt type, that gives 432 × (2.48/12) = 89.28 characters per line. With minimal margins of 1″ you get 96 characters per line.)

Applying typewriter rules to electronically typeset manuscripts results in very long lines, and long lines are hard to read. A line should contain between 50 and 75 characters, and 66 characters is widely considered ideal. Readability is a virtue you want your thesis to have. And the thesis guidelines, thankfully, no longer set the margins, but only require minimum margins of 1″ on all sides.

sample-thesis

Modal Logic! Propositional Logic! Tableaux!

Lots of new stuff in the Open Logic repository! I’m teaching modal logic this term, and my ambitious goal is to have, by the end of term or soon thereafter, another nicely organized and typeset open textbook on modal logic. The working title is Boxes and Diamonds, and you can check out what’s there so far on the builds site.

This project of course required new material on modal logic. So far this consists of revised and expanded notes by our dear late colleague Aldo Antonelli. These now live in content/normal-modal-logic and cover relational models for normal modal logics, frame correspondence, derivations, canonical models, and filtrations. So that’s one big, exciting addition.

Since the OLP didn’t cover propositional logic separately, I have now added that part as well, so I can include it as review chapters. There’s a short chapter on truth-value semantics in propositional-logic/syntax-and-semantics; all the proof systems, and completeness for them, are covered as well. I didn’t write anything new for those, but rather made the respective sections for first-order logic flexible. OLP now has an FOL “tag”: if FOL is set to true and you compile the chapter on the sequent calculus, say, you get the full first-order version, with soundness proved relative to first-order structures. If FOL is set to false, the rules for the quantifiers and identity are omitted, and soundness is proved relative to propositional valuations. The same goes for the completeness theorem: with FOL set to false, it leaves out the Henkin construction and constructs a valuation from a complete consistent set, rather than a term model from a saturated complete consistent set. This works fine if you need only one or the other; if you want both, you’ll currently get a lot of repetition. I hope to add code so that you can first compile without FOL and then with it, with the second pass referring to the text produced by the first pass rather than doing everything from scratch. You can compare the two versions in the complete PDF. (A generic sketch of how such a tag can work appears at the end of this post.)

Proof systems for modal logics are tricky, and many systems don’t have nice, say, natural deduction systems. The tableau method, however, works very nicely and uniformly, and the OLP didn’t have a chapter on tableaux, so this motivated me to add one. Tableaux are also often covered in intro logic courses (often called “truth trees”), so including them as a proof system has the added advantage of tying in better with introductory logic material. I opted for signed tableaux (true and false are explicitly labelled, rather than implicit in negated and unnegated formulas), since that lends itself more easily to a comparison with the sequent calculus, but also because it extends directly to many-valued logics. The material on tableaux lives in first-order-logic/tableaux.

Thanks to Clea Rees for the prooftrees package, which made it much easier to typeset the tableaux, and to Alex Kocurek for his tips on doing modal diagrams in TikZ.
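As promised, here is a generic sketch of how a tag like FOL can be implemented, using etoolbox toggles. This illustrates the idea only; OLP’s actual tagging mechanism differs:

\documentclass{article}
\usepackage{etoolbox}
\newtoggle{FOL}
\toggletrue{FOL} % change to \togglefalse{FOL} for the propositional version
\begin{document}
\iftoggle{FOL}{%
  Soundness is proved relative to first-order structures.%
}{%
  Soundness is proved relative to propositional valuations.%
}
\end{document}

Compiling the same source with the toggle set one way or the other then produces the two versions of the text.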

Making an Accessible Open Logic Textbook (for Dyslexics)

In the design and layout of the Open Logic Project texts, as well as the Calgary Remix of the intro text forall x, we’ve tried to follow the recommendations of the BC Open Textbook Accessibility Toolkit from the start: content is organized into sections, important concepts are highlighted (e.g., colored boxes around definitions and theorems), chapters have summaries, etc. We picked an easily readable typeface and set line and page lengths to enhance readability, according to best (text)book design practices and research.

We’ve now started experimenting specifically with a version of forall x that is better for dyslexic readers (see issue 22). Readability for dyslexics is affected by typeface, type size, and letter and line spacing; Charles Bigelow gives a good overview of the literature here.

Some typefaces are better for dyslexic readers than others. Generally, sans-serif fonts are preferable, but individual letter design is also relevant. The British Dyslexia Association has a page on this: the design should make it easy to distinguish letters, not just when they are close in shape (e.g., numeral 1, uppercase I, and lowercase l; numeral 0, uppercase O, lowercase o, and lowercase a) but also when they are upside-down or mirror images (e.g., p and q, b and d; M and W). In one study of reading times and reported preference, the sans-serif fonts Arial, Helvetica, and Verdana ranked better than other fonts such as Myriad, Courier, Times, and Garamond, and even better than the specially designed Open Dyslexic typeface. Although it would be possible to get LaTeX to output in any available typeface, it’s perhaps easiest to stick to those that come with the standard LaTeX distributions. From the readability perspective, the typeface that seems best to me is Go Sans: it was designed by Bigelow & Holmes with readability in mind, and it does distinguish nicely between p and q; b and d; I, l, and 1; etc. Other things improve readability as well (a minimal LaTeX sketch of such settings follows the list):
  • larger type size
  • shorter lines
  • increased line spacing
  • increased character spacing, i.e., “tracking” (although see Bigelow’s post for conflicting evidence)
  • no ALL CAPS or italics
  • no word hyphenation or right-justified margins
  • no centered text
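Here is a minimal LaTeX sketch of the kinds of preamble settings involved. It assumes pdfLaTeX with the geometry, setspace, microtype, hyphenat, and ragged2e packages; it is not the actual forall x class code:

\documentclass[12pt]{article}           % larger type size
\usepackage[textwidth=4.2in]{geometry}  % shorter lines
\usepackage{setspace}
\setstretch{1.4}                        % increased line spacing
\usepackage[tracking=true]{microtype}
\SetTracking{encoding=*}{25}            % slightly increased tracking
\DisableLigatures{encoding=*, family=*} % disable ff/fi/ffi ligatures
\usepackage[none]{hyphenat}             % no word hyphenation
\usepackage[document]{ragged2e}         % flush left/ragged right throughout
\renewcommand{\emph}[1]{\textbf{#1}}    % boldface instead of italics
\begin{document}
Emphasis now comes out \emph{bold}; lines are short, unjustified, and unhyphenated.
\end{document}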
The accessible version of forall x does all these things. Type size is set to 12 pt (not optimal on paper, but since this PDF would mainly be read on a screen, it looks large enough). Lines are shorter (about 40 instead of 65 characters per line). Line spacing is set at 1.4 line heights. Tracking is increased slightly, and ligatures (ff, fi, ffi) are disabled. Emphasis and defined terms are set in boldface instead of italics and small caps. Lines are set flush left/ragged right, and words are not hyphenated. The centered part headings are now also set flush left.

The changes did break some of the page layout, especially in the quick reference, which still has to be fixed. There is also some content review to do. In “Mhy Bib I Fail Logic? Dyslexia in the Teaching of Logic,” Xóchitl Martínez Nava suggests avoiding symbols that are easily confused (e.g., don’t use ∧ and ∨), avoiding formulas that mix letters and symbols that are easily confused (e.g., A and ∀, E and ∃), and avoiding easily confused letters in the same example (p, q). She also recommends introducing Polish notation in addition to infix notation, which would not be a bad idea anyway. Polish notation, I’m told, would also be much better for blind students who rely on screen readers or Braille displays. (The entire article is worth reading; h/t to Shen-yi Liao.)

Feedback and comments welcome, especially if you’re dyslexic! There’s a lot more to be done, of course, especially to make the PDFs accessible to the vision-impaired. LaTeX and PDF are very good at producing visually nice output, but not good at producing output suitable for screen readers, for instance. OLP issue 82 is there to remind me to get OLP output to verify as PDF/A compliant, which means in particular that the output PDF will have ASCII alternatives to all formulas, so that a screen reader can read them aloud. Even better would be a good way to convert the whole thing to HTML/MathML (forall x issue 23).

forallxyyc-accessible

Logic Courseware?

Kit Fine asked me for suggestions of online logic materials that have some interactive component, i.e., ways for students to build truth tables, evaluate arguments, translate sentences, build models, and do derivations; ideally, it would not just provide feedback to the student but also grade problems and tests. There is of course Barwise & Etchemendy’s Language, Proof and Logic, which comes with software that does these things very well and also has a grading service. But are there things that are free, preferably online, preferably open source?

  • First we have David Kaplan’s Logic 2010. It’s written in Java, runs on Windows and Mac, is free but not open source, and has a free online grading component. It goes with Terry Parsons’s An Exposition of Symbolic Logic, which is also free. To use the software and grading service, you’d have to make arrangements with David. The text does propositional and first-order logic, including models and Kalish-Montague derivations. I haven’t tried the software, but it’s used in a number of places.
    [Free software ✓ Free book ✓ Online ✗ Open source ✗]
  • UPDATE: Carnap is an open-source framework for writing webapps for teaching logic, written by Graham Leach-Krouse and Jake Ehrlich. It comes with a (free, but not openly licensed) online book and can currently check truth tables, translations, and Kalish-Montague derivations (first-order models are in the works). Students can have accounts and submit exercises. The software is written in Haskell and is open source (see GitHub). It’s used at Kansas State and the University of Birmingham.
    [Free software ✓ Free book ✓ Online ✓ Open source ✓]
  • Kevin Klement is teaching logic from the (free) book by Hardegree, Symbolic Logic: A First Course. (There’s a newer version that doesn’t seem to be freely available.) He has an online component (exercises and practice exams) with multiple-choice questions, truth tables, translations, and Fitch-style derivations. I’m not sure if the backend code for all of this is available and could be adapted to your own needs. However, he has provided a version of the proof checker that works with the Cambridge and Calgary versions of forall x, and that code is open source. I’m not sure if it’s possible to add the functionality he has on the UMass site for saving student work. Neither the book nor the online exercises cover models for first-order logic.
    [Free software ✓ Free book ✓ Online ✓ Open source ?]
  • The Logic Daemon by Colin Allen and Chris Menzel accompanies Allen and Michael Hand’s Logic Primer. It can check truth tables, models, and Suppes-Lemmon derivations, and generate quizzes. The interface is basic, but the functionality is extensive. There doesn’t seem to be a grading option, however. The software seems to be written in Perl; I didn’t see the source code available.
    [Free software ✓ Free book ✗ Online ✓ Open source ✗]
  • Then there is Ray Jennings and Nicole Friedrich’s Project Ara, which includes Simon, a logic tutor, and Simon Says, a grading program. The textbook is Proof and Consequence, published by Broadview (i.e., not free). It does truth tables, translations, and Suppes-style derivations, but no models. It requires installing software on your own computer; the software is free (though not open source) and runs on Windows, Mac, and Linux. I haven’t tried it out. (That website, though!)
    [Free software ✓ Free book ✗ Online ✗ Open source ✗]
  • Wilfried Sieg’s group has developed AProS, which includes proof and counterexample construction tools. I don’t think these are openly available, however. They are used in Logic & Proofs, offered through CMU’s Open Learning Initiative. According to the description, it’s available both as a self-paced course and for other academic institutions to use, either in a self-paced format or in a traditional course with computer support. I’m not sure what the conditions are or whether it’s free, and I have neither inspected the texts nor tried the software.
    [Free software ? Free book ? Online ✗ Open source ✗]

Do you know of anything else that could be used to teach a course with an online or electronic component? Any experience with the options above?

Graphing Survey Responses

As I reported last year, we’ve been running surveys in our classes that use open logic textbooks. We now have another year of data, and I’ve figured out R well enough to plot the results. Perhaps someone else is in a similar situation, so I’ve written down all the steps. The results aren’t perfect yet. All the data and code are on GitHub, and any new discoveries I make will be added there.

What follows is the content of the HOWTO:

As part of two Taylor Institute Teaching & Learning Grants, we developed course materials for use in Calgary’s Logic I and Logic II courses. In the case of Logic I, we also experimented with partially flipping the course. One of the requirements of the grants was to evaluate the effectiveness of the materials and interventions. To evaluate the textbooks, we ran a survey in the courses using the textbooks, and in a number of other courses that used commercial textbooks. These surveys were administered through SurveyMonkey. To evaluate the teaching interventions, we designed a special course evaluation instrument that included a number of general questions with Likert responses. The evaluation was done on paper, and the responses to the Likert questions were entered into a spreadsheet.

In order to generate nice plots of the results, we use R. This document describes the steps we took.

Installing R, RStudio, and likert

We’re running RStudio, a free GUI frontend to R. In order to install R on Ubuntu Linux, we followed the instructions here, updated for zesty:

  • Start “Software & Updates”, choose to add a source, and enter the line
    http://cran.rstudio.com/bin/linux/ubuntu zesty/
    

    Then in the command line:

    $ sudo apt-get install r-base r-base-dev
    
  • We then installed RStudio using the package provided here. The R packages for analyzing Likert data and plotting them require devtools, which we installed following the instructions here:
    $ sudo apt-get install build-essential libcurl4-gnutls-dev libxml2-dev libssl-dev
    $ R
    > install.packages('devtools')
    
  • Now you can install the likert package from GitHub:
    > library(devtools)
    > install_github('jbryer/likert')

Preparing the data

The source data comes in CSV files, teachingevals.csv for the teaching evaluation responses, and textbooksurvey.csv for the textbook survey responses.

Since we entered the teaching evaluation responses manually, it was relatively simple to provide them in a format usable by R. Columns are Respondent ID for a unique identifier, Gender (M for male, F for female, O for other), Major, Year, Q1 through Q9 for the nine Likert questions. For each question, a response of one of Strongly Agree, Agree, Neutral, Disagree, or Strongly Disagree is recorded.
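Schematically, the file then looks like this (made-up rows for illustration only; the ellipsis stands for Q3 through Q8):

Respondent ID,Gender,Major,Year,Q1,Q2,...,Q9
1,F,PHIL,3,Agree,Neutral,...,Strongly Agree
2,M,MATH,2,Strongly Agree,Agree,...,Disagree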

For the textbook survey we collected many more responses, and the data SurveyMonkey provided came in a format not directly usable by R. We converted it to a more suitable format by hand.

  • SurveyMonkey results have two header lines: the first gives the question, the second the possible responses in multiple-response questions. We have to delete the second line. For instance, a question may have five different possible responses, which correspond to five columns. If a box was checked, the corresponding cell in a response contains the answer text; otherwise it is empty. In single-choice and Likert responses, SurveyMonkey reports the text of the chosen answer. For analysis, we wanted a simple 1 for checked and 0 for unchecked, and a number from 1 to 5 for the Likert answers. This was easy enough to do with some formulas and search-and-replace.
  • Since the question texts in the SurveyMonkey spreadsheet don’t make for good labels for importing from CSV, we replaced them all by generic labels such as Q5 (or Q6R2, for Question 6, Response 2, for multiple-choice questions).
  • We deleted data columns we don’t need, such as timestamps, and empty columns for data we didn’t collect, such as names and IP addresses.
  • We added columns so we can collate data more easily: Section to identify the individual course the data is from, Course for which course it is (PHIL279 for Logic I, PHIL379 for Logic II), Term for Fall or Winter term, Open to distinguish responses from sections using an open or a commercial text, and Text for the textbook used. Text is one of SLC (for Sets, Logic, Computation), BBJ (for Boolos, Burgess, and Jeffrey, Computability and Logic), ForallX (for forall x: Calgary Remix), Chellas (for Chellas, Elementary Formal Logic), or Goldfarb (for Goldfarb, Deductive Logic). This was done by combining the multiple individual spreadsheets provided by SurveyMonkey into one. (One spreadsheet contained responses from three different “Email Collectors,” one for each section surveyed.) Q27GPA contains the answer to Question 27, “What grade do you expect to get?”, converted to a 4-point grade scale.
  • Question 23, “Is the price of the textbook too high for the amount of learning support it provides?”, had the same answer scale as other questions (“Not at all” to “Very much so”), but the “Not at all” is now the positive answer, and “Very much so” the negative answer. To make it easier to produce a graph in line with the others, I added a Q23Rev column, where the values are reversed (i.e., Q23Rev = 6 – Q23).
  • Q26 is the 4-letter code of the major reported in the multiple-choice Question 26, and Q26R1 to Q26R8 are the responses to the checkboxes for the options “Mathematics,” “Computer Science,” “Physics,” “Philosophy,” “Engineering,” “Neuroscience,” “Other,” and the write-in answer for Other. These don’t correspond exactly to the questions asked: we offered “Linguistics” as an answer, but no one selected it, and a number of “Other” respondents indicated a Neuroscience major, so Q26R6 corresponds to NEUR in Q26. Question 26 allowed multiple answers; Q26 records the first answer only.
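Hypothetically, some of this clean-up could also be scripted in R after loading the raw export instead of editing the spreadsheet. A sketch, using the column names assumed in this HOWTO (survey, Q1R1, Q23):

# checkbox columns: answer text if checked, empty/NA if not; convert to 1/0
survey$Q1R1 <- ifelse(is.na(survey$Q1R1) | survey$Q1R1 == "", 0, 1)

# reverse-code Q23 so that, like the other questions, higher = more positive
survey$Q23Rev <- 6 - survey$Q23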

Loading data into R

In order to analyze the Likert data, we have to tell R which cells contain what, set the levels in the right order, and rename the columns so they are labelled with the question text instead of the generic Q1 etc. We’ll begin with the teaching evaluation data. The code is in teachingevals.R. Open that file in RStudio. You can run individual lines from that file, or selections, by highlighting the commands you want to run and then clicking on the “run” button.

First we load the required packages. likert is needed for all the Likert stuff; plyr just so we have the rename function used later; and reshape2 for the melt function.

require(likert)
require(plyr)
library(reshape2)

Loading the data from a CSV value file is easy:

data <- read.csv("teachingevals.csv",
                na.strings="")

Now the table data contains everything in our CSV file, with empty cells having the NA value rather than an empty string. We want the responses to be labelled by the text of the question rather than just Q1 etc.

data <- rename(data, c(
  Q1 = "In-class work in groups has improved my understanding of the material", 
  Q2 = "Collaborative work with fellow students has made the class more enjoyable", 
  Q3 = "Being able to watch screen casts ahead of time has helped me prepare for class", 
  Q4 = "Having lecture slides available electronically is helpful", 
  Q5 = "I learned best when I watched a screencast ahead of material covered in class", 
  Q6 = "I learned best when I simply followed lectures without a screencast before", 
  Q7 = "I learned best studying material on my own in the textbook", 
  Q8 = "This course made me more likely to take another logic course", 
  Q9 = "This course made me more likely to take another philosophy course"))

The Likert responses are in columns 5–13, so let’s make a table with just those:

responses <- data[c(5:13)]

The responses table still contains just the answer strings; we want to tell R that these are levels, and have the labels in the right order (“Strongly Disagree” = 1, etc.)

mylevels <- c('Strongly Disagree', 'Disagree', 'Neutral', 'Agree', 'Strongly Agree')

for(i in seq_along(responses)) {
  responses[,i] <- factor(responses[,i], levels=mylevels)
}
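To check that the conversion worked, you can look at a column’s summary; it should now report counts for each of the five levels rather than treat the answers as plain strings:

summary(responses[, 1])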

Analyzing and Plotting

Now we can analyze the Likert data.

lresponses <- likert(responses)

You can print the analyzed Likert data:

> lresponses
  Item
1          In-class work in groups has improved my understanding of the material
2      Collaborative work with fellow students has made the class more enjoyable
3 Being able to watch screen casts ahead of time has helped me prepare for class
4                      Having lecture slides available electronically is helpful
5  I learned best when I watched a screencast ahead of material covered in class
6     I learned best when I simply followed lectures without a screencast before
7                     I learned best studying material on my own in the textbook
8                   This course made me more likely to take another logic course
9              This course made me more likely to take another philosophy course
  Strongly Disagree  Disagree   Neutral    Agree Strongly Agree
1          1.785714  5.357143 10.714286 37.50000      44.642857
2          1.785714  0.000000 10.714286 37.50000      50.000000
3          8.928571 14.285714 26.785714 28.57143      21.428571
4          1.785714  1.785714  5.357143 37.50000      53.571429
5          7.142857 10.714286 37.500000 33.92857      10.714286
6          3.571429 19.642857 51.785714 21.42857       3.571429
7          3.571429 12.500000 23.214286 33.92857      26.785714
8         20.000000 10.909091 32.727273 27.27273       9.090909
9         16.363636 18.181818 38.181818 18.18182       9.090909

And now we plot it:

plot(lresponses,
  ordered=FALSE,
  group.order=names(responses),
  colors=c('darkred','darkorange','palegoldenrod','greenyellow','darkgreen')) +
  ggtitle("Teaching Evaluations")

The group.order=names(responses) option sorts the lines of the plot in the order of the questions; you need ordered=FALSE, or else they’ll be ordered alphabetically. Leave both out and you get the questions sorted by level of agreement. You can of course change the colors to suit.

In textbooksurvey.R we do much the same, except for the results of the textbook survey. Some examples of the differences:

Here’s how to group charts for multiple questions by textbook used:

lUseByText <- likert(items=survey[,27:31,drop=FALSE],
                 grouping=survey$Text)
plot(lUseByText, 
  ordered=TRUE,
  group.order=c('SLC','BBJ','ForallX','Chellas','Goldfarb'),
  colors=c('darkred', 'darkorange', 'palegoldenrod','greenyellow','darkgreen')
  ) + 
  ggtitle("Textbook Use Patterns")

To plot a bar chart for a scaled question, but without centering the bars, use centered=FALSE:

lQ5byText <- likert(items=survey[,26,drop=FALSE],
                   grouping=survey$Text)
plot(lQ5byText, 
  ordered=TRUE,
  centered= FALSE,
  group.order=c('SLC','BBJ','ForallX','Chellas','Goldfarb'),
  colors=c('darkred','darkorange', 'gold', 'palegoldenrod','greenyellow','darkgreen')
  ) +
  ggtitle("Textbook Use Frequency")

Plotting Bar Charts for Multiple-Answer Questions

Some of the questions in the textbook survey allowed students to check multiple answers. We want those plotted with a simple bar chart, grouped by, say, the textbook used. To do this, we first have to prepare the data: we extract the responses into a new table.

Q1 <- survey[,c(6,7:13)]

Now Q1 is just the column Text plus Q1R1 through Q1R7. Next, we sum the answers (a checkmark is a 1, unchecked is 0, so the number of mentions is the sum).

Q1 <- ddply(Q1,.(Text),numcolwise(sum))

Next, we convert this to “long form”:

Q1 <- melt(Q1,id.var="Text")

Q1 now has three columns: Text, variable, and value. Now we can plot it:

ggplot() + 
  geom_bar(
    aes(x=Text,fill=variable,y=value),
    data=Q1,
    stat="identity") + 
  coord_flip() +
  ggtitle("01. How do you access the textbook?") +
  theme(legend.position = "bottom",
        axis.title.x = element_blank()) +
  guides(fill=guide_legend(title=NULL,ncol=1))

This makes a bar chart with Text on the x-axis, stacking by variable, and using the value column for the height of each bar. stat="identity" means to use value as given rather than count rows. coord_flip() makes it into a horizontal chart. ggtitle(...) adds a title, theme(...) puts the legend on the bottom and removes the x-axis label, and guides(...) formats the legend in one column.

UPDATE: Better Visualization of Multiple-Answer Responses

I figured out a better way to visualize multiple-answer responses (thanks to Norbert Preining for the help!). You don’t want the number of respondents who checked a box, but the percentage of all respondents (in a category) who did, so instead of adding up a column you compute its mean. Also, aggregate is an easier way to do this, and it doesn’t make sense to stack the responses, so I’m going to graph them side by side.

Here’s the code:

# load responses for question 4 into df Q4
Q4 <- survey[,c(6,20:25)]

# aggregate by Text, computing means = percent respondents who checked box
Q4 <- aggregate( . ~ Text, data=Q4, mean)

# make table long form for ggplot
Q4 <- melt(Q4,id.var="Text")

ggplot() + 
  geom_bar(
    aes(x=variable,fill=Text,y=value),
    data=Q4,
    stat="identity", position="dodge") + 
  coord_flip() +
  ggtitle("04. When using the text in electronic form, do you....") +
  theme(legend.position = "bottom",
        axis.title.x = element_blank()) +
  guides(fill=guide_legend(title=NULL,ncol=1)) +
  scale_fill_brewer(palette="Dark2") +
  scale_y_continuous(labels = scales::percent)