Modifying sensitivity to flag unlikely answers

Added by Stefan Reinsberg over 9 years ago

I am noticing that a lot of 'mis-recognitions' are due to students inadequately erasing one answer and choosing a different one (where correction fluid or tape would have worked really nicely). Another 'mis-recognition' that's easy to diagnose is when students use a very faint pen whose marks fall below the darkness threshold, so that they appear not to have chosen any answer at all. In the OpenOffice export these cases are very nicely flagged in yellow and red.

However, from a usability point of view I would like to see these problems already during the manual recognition stage. I was thinking a simple way to do this without any changes to the UI would be to bump up the sensitivity parameter on a page where several answers are ticked although only one is allowed, or where none are ticked although there should be one. Where would I find this sensitivity threshold, so that one could quickly identify these problems without having to go back and forth between the spreadsheet and the Manual Mark Recognition window? Possibly there would have to be an iterative round in which the 'Marking' is performed first to find the instances of unlikely answers.


Replies (2)

RE: Modifying sensitivity to flag unlikely answers - Added by Alexis Bienvenüe over 9 years ago

I'm not sure I understand your idea.
About recognition of ticked boxes:
  • AMC first converts grayscale images to black-and-white, applying a threshold (Edit/Preferences/Scan/Scans conversion/Black&white conversion threshold). If light grays are mis-converted to white, you can increase this value.
  • Then AMC counts in each box the number of black pixels, and compares the darkness ratio (number of black pixels)/(total number of pixels) to the darkness threshold (Edit/Preferences/Project/Automatic data capture/Darkness threshold) to decide if the box is considered as ticked or not.
  • AMC computes a sensitivity indicator for each page, which is near 10 (its maximum value) if some boxes on this page have a darkness ratio very near the darkness threshold. A high sensitivity value indicates that a small modification of the darkness threshold can change the results of automatic data capture, and a low sensitivity value tells us that a moderate change of the darkness threshold won't change the results.
So sensitivity (as defined in the UI) is not a parameter that you can change. Perhaps you meant darkness threshold?
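To make the first two steps concrete, here is a minimal Python sketch of the logic as described above. It is not AMC's actual code: the function names, the 0 (black) to 255 (white) gray scale, the thresholds expressed as fractions between 0 and 1, and the default values are all assumptions chosen for illustration.

```python
def to_black_and_white(gray_rows, bw_threshold=0.6):
    """Convert gray levels (0 = black, 255 = white) to booleans, True = black.

    Assumption: a pixel becomes black when its gray level is below
    bw_threshold * 255, so increasing bw_threshold turns more light
    grays into black.
    """
    cutoff = bw_threshold * 255
    return [[pixel < cutoff for pixel in row] for row in gray_rows]


def darkness_ratio(bw_rows):
    """(number of black pixels) / (total number of pixels) in one box."""
    total = sum(len(row) for row in bw_rows)
    black = sum(sum(row) for row in bw_rows)
    return black / total if total else 0.0


def is_ticked(gray_rows, bw_threshold=0.6, darkness_threshold=0.15):
    """A box counts as ticked when its darkness ratio exceeds the threshold."""
    ratio = darkness_ratio(to_black_and_white(gray_rows, bw_threshold))
    return ratio > darkness_threshold


# A hypothetical box marked with a very faint pen: half the pixels are light gray.
faint_box = [
    [200, 200, 255, 255],
    [255, 200, 200, 255],
]

print(is_ticked(faint_box, bw_threshold=0.6))  # light grays become white
print(is_ticked(faint_box, bw_threshold=0.9))  # light grays become black
```

With the lower conversion threshold the faint marks are all converted to white and the box looks empty; with the higher one they are counted as black pixels and the box is detected as ticked, which is the faint-pen situation described in the first post.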
My workflow is the following:
  • In Data capture/Diagnosis table, sort the rows by decreasing sensitivity.
  • Click on the first row, open the zooms window and check and correct the recognition of ticked boxes.
  • Move to the next row, until sensitivity is low, or until you have not changed anything in the recognition for the last (say) 10 rows.
  • Mark, annotate sheets, send them to the students.
  • Tell the students to check that recognition was OK on their sheets — and correct if necessary.

As you have noticed, the problem here is that we are not aware of "coherence problems" (two or more boxes checked on a simple question, or no boxes ticked at all) when correcting the recognition from the zooms window.
Perhaps I can add a column to the Diagnosis table with this data, but it is available only after marking.
Can you clarify the solution you propose, so I can understand it better?
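For reference, the kind of "coherence" check I have in mind could be sketched in Python as follows. This only illustrates the rule stated above and is not how AMC stores or processes its data: the function name, the dictionary layout and the message strings are hypothetical, and treating an empty question as a problem for every question type is an assumption.

```python
def coherence_problems(ticked_by_question, simple_questions):
    """Return {question: description} for questions with unlikely answers.

    ticked_by_question maps a question name to a list of booleans,
    one per box (True = recognised as ticked).
    simple_questions is the set of questions where only one box is allowed.
    """
    problems = {}
    for question, ticked in ticked_by_question.items():
        count = sum(ticked)
        if count == 0:
            # Assumption: an empty question is suspicious whatever its type.
            problems[question] = "no box ticked"
        elif question in simple_questions and count > 1:
            problems[question] = "several boxes ticked"
    return problems


# Hypothetical recognition result for one student sheet.
sheet = {
    "Q1": [True, False, False],   # fine
    "Q2": [True, True, False],    # badly erased answer?
    "Q3": [False, False, False],  # faint pen?
}

print(coherence_problems(sheet, simple_questions={"Q1", "Q2", "Q3"}))
```

A page would then deserve a second look in the zooms window whenever this returns a non-empty result for one of its questions.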

RE: Modifying sensitivity to flag unlikely answers - Added by Stefan Reinsberg over 9 years ago

Yes - so what I meant was to add the column you mentioned to the Diagnosis table. I understand that this would only be available after marking. That would mean your workflow would require going back to step 3 ("Move to the next row, ...") after having gone through the marking.

The addition of a column (diagnosis based on coherence) would be the best solution. I simply thought that it could be subsumed in sensitivity (or, in this case, an 'updated-post-marking' sensitivity) so that the UI doesn't have to be touched. Of course that's not as clean a solution.

The bottom line is that I was hoping that catching "coherence problems" before my students report back from their annotated solutions will save me some work down the line. (by the way, I am looking at sending 4.8GB of mail later today for my entire class - I hope this will work out...)
