Nezumi Workbench: November 2013

Monday, November 4, 2013

More Image Sharpness

ImageMagick has FFT built into it! You just use the appropriate command line options in ImageMagick's convert application. I found ImageMagick's FFT documentation really helpful in understanding 2D FFTs of images. It just takes two command lines. The first performs the FFT and creates a magnitude and a phase plot.

convert ohelo.png -fft fft_ohelo.png

The phase plot's not really useful to us, so we'll just take the magnitude plot, which is the one that has a -0 appended to its name. Because it's plotted on a linear scale, it doesn't look like much - almost entirely black. But if we auto-scale it, and plot logarithmically, a pattern emerges.

convert fft_ohelo-0.png -auto-level -evaluate log 10000000 fft_ohelo_10000000.png

You may have to play around with the log scaling. It took me several tries to get something usable.
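To see why the log constant matters so much, here is a small sketch of the scaling itself. It assumes that -evaluate log N maps a normalized pixel value u (0 to 1) to log(N*u + 1) / log(N + 1), which is how I read the ImageMagick documentation; the function name log_scale is just for illustration.

```python
import math

def log_scale(u, n):
    """Map a normalized intensity u in [0, 1] to log(n*u + 1) / log(n + 1).

    Assumption: this mirrors what `-evaluate log n` does to each pixel.
    """
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must be in [0, 1]")
    return math.log(n * u + 1.0) / math.log(n + 1.0)

if __name__ == "__main__":
    # Show how faint values are lifted as the constant grows.
    levels = (0.0, 0.001, 0.01, 0.1, 1.0)
    for n in (100, 10000, 10000000):
        row = ", ".join("%.3f" % log_scale(u, n) for u in levels)
        print("n=%d: %s" % (n, row))
```

Black stays black and white stays white, but the larger the constant, the more the faint values in between are pulled up toward visibility. That is why a near-black magnitude plot needs such a big number.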

Here is the set of test images:

Straight Out of the Camera

5 Pixel Blur

25 Pixel Blur

It's hard to tell the difference between the original and the 5 pixel blur, but look at the FFTs.

Straight Out of the Camera

5 Pixel Blur

25 Pixel Blur

The white areas farther from the center represent the highest spatial frequencies, which carry the finest detail, so an image whose FFT extends farther out is sharper. Blurring the image filters those frequencies out and results in a smaller circle. I'm not sure I understand the horizontal and vertical artifacts, but I think they may have something to do with the rectangular arrangement of the pixels.
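The same effect shows up in one dimension, which makes it easy to check without any image library. This is only a sketch: the bar pattern, the wrap-around box blur, and the cutoff bin are all made-up stand-ins for a real image row, not anything ImageMagick does internally.

```python
import cmath

def dft_magnitudes(samples):
    """Magnitude of each bin of a plain (slow) discrete Fourier transform."""
    n = len(samples)
    mags = []
    for k in range(n):
        total = sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total))
    return mags

def circular_box_blur(samples, width):
    """Moving average over `width` samples, wrapping around at the ends."""
    n = len(samples)
    half = width // 2
    return [sum(samples[(i + j) % n] for j in range(-half, half + 1)) / (2 * half + 1)
            for i in range(n)]

def high_frequency_energy(samples, cutoff):
    """Sum of the DFT magnitudes from bin `cutoff` up to the Nyquist bin."""
    mags = dft_magnitudes(samples)
    return sum(mags[cutoff:len(samples) // 2 + 1])

if __name__ == "__main__":
    bars = [0.0, 0.0, 1.0, 1.0] * 16          # fine black-and-white bars
    blurred = circular_box_blur(bars, 5)      # stand-in for a 5 pixel blur
    print("sharp:  ", high_frequency_energy(bars, 8))
    print("blurred:", high_frequency_energy(blurred, 8))
```

The blurred row keeps its average brightness (the center of the FFT) but loses most of the energy in the outer bins, which is the shrinking circle in the plots.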

To make this more quantifiable, I next plan to use ImageMagick's Sample option to grab radial lines at 5 degree increments between 5 and 85 degrees. Because of the symmetry of the quadrants, it's only necessary to sample from one of them. I'll then average the samples and plot them on an X-Y axis.
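The radial sampling plan could look roughly like this. It is a sketch under assumptions: the magnitude plot is already loaded as a list of rows of numbers, the zero-frequency point sits at the middle of the image, and nearest-pixel lookup is good enough. The name radial_profile is hypothetical.

```python
import math

def radial_profile(image, angles_deg=range(5, 90, 5)):
    """Average brightness versus radius along radial lines in one quadrant.

    `image` is a list of rows; the center pixel is taken as the origin.
    The default angles are 5, 10, ..., 85 degrees.
    """
    rows, cols = len(image), len(image[0])
    cy, cx = rows // 2, cols // 2
    max_radius = min(cx, cy) - 1              # stay inside the image
    angles = [math.radians(a) for a in angles_deg]
    profile = []
    for r in range(max_radius + 1):
        total = 0.0
        for theta in angles:
            x = cx + int(round(r * math.cos(theta)))
            y = cy - int(round(r * math.sin(theta)))   # upper-right quadrant
            total += image[y][x]
        profile.append(total / len(angles))
    return profile

if __name__ == "__main__":
    # Synthetic test image whose brightness equals distance from the center.
    size = 21
    img = [[math.hypot(x - 10, y - 10) for x in range(size)] for y in range(size)]
    for r, value in enumerate(radial_profile(img)):
        print(r, round(value, 2))
```

The result is one number per radius, which is the X-Y curve to plot. A sharper image should hold its level farther out along the radius axis.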

Also, to make this real, I need to take the pictures with two different cameras or lenses. I think the subject should be something natural like leaves because man-made objects in some examples I've seen tend to have strong diagonal artifacts.

A later post will explore some of these refinements and possibly add some Ruby automation to the process.

Sunday, November 3, 2013

How Sharp is That Lens?

As I read reviews of photographic equipment, I occasionally come across a review in which the reviewer notes that the reviewed lens was a dud and didn't focus properly. How could I tell if I had a dud lens without a way to compare it to others? Some kind of quantitative method for lens comparison is needed.

There are all kinds of sharpness test patterns, but none of them seemed to be easy to use. Then I saw an example of a pattern of black and white bars that got progressively closer together. You can tell how sharp the lens is by looking at where the bars mush together to form gray. I looked at this and realized that this is the spatial equivalent of the swept-sine frequency response test used for audio equipment. What if I approached this like a signal processing problem? A basic test of signal processing equipment is the step response. On a first-order system you can use this to determine its time constant, which is a fundamental metric. Could this be the metric I was looking for?

I used a laser printer to make a sheet of paper that was half black and half white. I took three pictures with an old point-and-shoot camera: wide, tele, and purposely blurred. The photos were taken at maximum resolution.

WIDE

TELE

BLUR

Sampling a line of pixels from left to right through the middle of each image should result in a step response from black to white. To extract the pixel values I used a Ruby script with RMagick:

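require 'RMagick'   # newer RMagick releases may expect require 'rmagick'
include Magick

puts ARGV[0]

image = ImageList.new(ARGV[0])
midPointY = image.rows / 2

# Walk the middle row; the exclusive range (...) stops at the last valid column.
(0...image.columns).each do |x|
  pixel = image.pixel_color(x, midPointY)
  # Average the three channels to get a gray level.
  gray = (pixel.red + pixel.green + pixel.blue) / 3
  puts "#{x}, #{gray}"
end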

RMagick is so cool! I sent the output of the script to a spreadsheet to compare the three images.

Look at this! The wide angle has the fastest rise time. You can even see a little 2nd order ringing that's probably due to the compression algorithm. Interestingly, there's pre-ringing too because spatial systems are non-causal.

Note that the wide angle is not sharper simply because the camera had to be closer to the paper to fill the frame. I've noticed that this 12-year-old camera is just not as sharp on the telephoto setting as it used to be. This graph quantifies my observation. How much of a difference is there between wide and tele? We'll zoom in on the data.

In a first-order system, the time constant is measured at 63% of the final value of the step function. There are a couple of sample points in that area, so I'll use those as an approximation rather than interpolating an exact point. Now we can say that the wide-angle setting is more than twice as sharp as the telephoto setting.
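For anyone who would rather interpolate than pick the nearest sample, here is a minimal sketch of the 63% measurement. It assumes the row of gray levels rises from dark to bright, uses the first and last samples as the start and final values, and needs the caller to supply the pixel where the edge begins (onset), which is a judgment call. The function names are made up for this example.

```python
import math

def crossing_position(samples, fraction):
    """Interpolated position where a rising signal reaches `fraction` of its swing."""
    low, high = samples[0], samples[-1]
    target = low + fraction * (high - low)
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if a < target <= b:
            # Linear interpolation between the two neighboring samples.
            return (i - 1) + (target - a) / (b - a)
    raise ValueError("signal never crosses the target level")

def time_constant(samples, onset):
    """Distance in pixels from `onset` to the 63.2% point (1 - 1/e)."""
    return crossing_position(samples, 1.0 - 1.0 / math.e) - onset

if __name__ == "__main__":
    # Synthetic first-order edge: flat for 10 pixels, then a rise with tau = 4.
    tau = 4.0
    edge = [0.0] * 10 + [1.0 - math.exp(-i / tau) for i in range(60)]
    print(time_constant(edge, onset=10))
```

Dividing the telephoto result by the wide-angle result gives the "how many times sharper" ratio directly. With real photos it would probably help to average several pixels at each end for the start and final values, since single samples are noisy.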

There are still some questions I'd like to answer. How can I compare cameras with different pixel resolutions? Can I harmonize my results somehow with the lensmakers' specs? Could I average successive images to get better accuracy? How could I compare the center of the lens to the edges? Would deconvolution or FFT be useful analysis tools? These are all questions for a later blog post!

Saturday, November 2, 2013

Transcription Controller in an Afternoon

I have some audio files that I need to transcribe. I figured it would be easy just to load them up in the Audacity audio editor and type away. Not so easy. People talk much faster than I can type, and it's hard to control Audacity while trying to type on the word processor. Fortunately, Audacity has keyboard short-cuts. I just need a way to connect a foot pedal to Audacity.

An old PS/2 mouse makes a decent foot pedal. I gutted the unit, removing the scroll wheel, and then wired the mouse buttons to the I/O cable.

I then cut off the PS/2 connector, and wired the cable to pin 2 of an Arduino. I also added a 10K pull-up resistor. Here's what it looks like assembled and connected.

The Arduino was programmed to send the following text strings:

15 seconds after boot: "g"
mouse down: "0"
mouse up: "1"

Here's the code (adapted from Arduino Playground):

// digital pin 2 has a pushbutton attached to it.
int pushButton = 2;

// the setup routine runs once when you press reset:
void setup() {
  // initialize serial communication at 9600 bits per second:
  Serial.begin(9600);
  // make the pushbutton's pin an input:
  pinMode(pushButton, INPUT);
  // wait 15 seconds after boot before "g" is sent
  delay(15000);
}

void loop() {
  Serial.println("g");
  int initButtonState = digitalRead(pushButton);
  // loop forever
  while (1) {
    // read the input pin:
    int buttonState = digitalRead(pushButton);
    if (initButtonState != buttonState) {
      // print out the state of the button:
      Serial.println(buttonState);
      // debounce
      delay(5);
    }
    initButtonState = buttonState;
  }
}
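The change-detection logic is easy to model off the board. This sketch assumes the pull-up makes the pin read 1 when the button is released and 0 when it is pressed, which is why mouse down shows up as "0". The function emitted_lines is a made-up helper that mimics the sketch's output; it ignores the debounce delay.

```python
def emitted_lines(readings):
    """Return what the Arduino sketch would print for a series of pin readings.

    Readings are 1 (released, pulled high) or 0 (pressed, pulled to ground).
    The first reading plays the role of initButtonState.
    """
    out = ["g"]                      # sent once, after the 15 second delay
    if not readings:
        return out
    previous = readings[0]
    for state in readings[1:]:
        if state != previous:        # only changes are reported
            out.append(str(state))
        previous = state
    return out

if __name__ == "__main__":
    # Released, pressed for a while, then released again.
    print(emitted_lines([1, 1, 0, 0, 0, 1, 1]))
```

Holding the pedal down produces a single "0", and nothing more is sent until it is released.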

I decided to code the computer side in Python because it's a pretty fun and easy language with lots of libraries. But the first thing I needed was X Windows automation, and there seem to be a lot of choices. Even though it's been replaced by Xaut, I found Xautomation worked for me. I got Python and Xautomation from the Linux Mint Software Library, but I could have got them just as easily using apt-get.

For each of the received characters I used Xautomation to send the following key strokes.

g = space p (start playback and pause)
0 = p (un-pause)
1 = comma comma comma comma comma p (back up a little, then pause)
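The mapping above fits naturally in a lookup table. This is a Python 3 sketch rather than the script I actually ran: KEY_TABLE, keys_for, and send_keys are names invented for the example, and it assumes xte reads commands such as "key space" from standard input.

```python
import time
from subprocess import Popen, PIPE

# Received character -> Audacity key presses, as listed above.
KEY_TABLE = {
    "g": ["space", "P"],            # start playback, then pause
    "0": ["P"],                     # un-pause
    "1": ["comma"] * 5 + ["P"],     # back up a little, then pause
}

def keys_for(line):
    """Translate one line from the Arduino into a list of key names."""
    return list(KEY_TABLE.get(line.strip(), []))

def send_keys(keys, gap=0.1):
    """Feed each key to xte, pausing between presses (needs xte installed)."""
    for key in keys:
        command = "key %s\n" % key
        Popen(["xte"], stdin=PIPE).communicate(input=command.encode("ascii"))
        time.sleep(gap)

if __name__ == "__main__":
    # Dry run: show the key presses without touching xte or the serial port.
    for received in ("g\r\n", "0\r\n", "1\r\n"):
        print(repr(received), "->", keys_for(received))
```

Keeping the table separate from the sending code makes it easy to try different back-up amounts without rewiring the whole loop.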

The last link was the serial link connecting the Arduino to the Python code. pySerial looked like a good library, and to get it I would need python-pip.

sudo apt-get install python-pip
pip install pySerial

pySerial didn't work at first. I found I had to execute the following commands.

sudo usermod -a -G dialout tester
sudo chmod 777 /dev/ttyACM0

The first command adds the user (tester, in my case) to the dialout group, which gives you permission to access serial I/O. The second gives you permission to use the particular USB device. Unless you put these commands in a bash script, you'll have to execute them every time you run the program. Also, depending on your hardware, your USB device may have a different name (like ttyUSB0). The short-cut to getting pySerial working would be to run Python as root, but that is a very bad idea.

Here's the code in all its ugliness:

# serial_read_keys.py
import time
import serial
from subprocess import Popen, PIPE

control_f4_sequence = '''keydown Control_L
key F4
keyup Control_L
'''

shift_a_sequence = '''keydown Shift_L
key A
keyup Shift_L
'''

initialize_sequence = '''key space
key P
'''

play_sequence = '''key space
'''

unpause_sequence = '''key P
'''

pause_sequence = '''key P
'''

backup_sequence = '''key comma
'''

def keypress(sequence):
    p = Popen(['xte'], stdin=PIPE)
    p.communicate(input=sequence)

ser = serial.Serial('/dev/ttyACM0',9600)

while (1) :
        #print 'reading line'
        rcvChar = ser.readline()
        # print rcvChar
        if 'g' in rcvChar :
            print 'initialize - play and pause'
            keypress(play_sequence)
            time.sleep(0.1)
            keypress(pause_sequence)
        if '0' in rcvChar :
            print 'unpause'
            keypress(unpause_sequence)
        if '1' in rcvChar :
            print 'backup a little then pause'
            keypress(backup_sequence)
            time.sleep(0.1)
            keypress(backup_sequence)
            time.sleep(0.1)
            keypress(backup_sequence)
            time.sleep(0.1)
            keypress(backup_sequence)
            time.sleep(0.1)
            keypress(backup_sequence)
            time.sleep(0.1)
            keypress(pause_sequence)

I had to do some experimentation, and I left all of that in there so I could document what I had learned.

To do transcription, first open your audio file with Audacity. You may want to use the Effect, Change Tempo menu item to slow down the play-back. Now start the Python script. You have 15 seconds to do the following: make sure the Audacity stop button is clicked, then click on the waveform you want to transcribe.

After 15 seconds, the script will press the play key, then immediately press pause. Don't touch anything on your screen again. If you do, Audacity will lose focus and the key-presses won't go to it. So, how are you supposed to type the transcription then? Use another computer! I neglected to tell you that, didn't I?

Go to the other computer, mash down on the mouse with your foot, and the audio will begin to play. Release the mouse and the audio will back up about 5 seconds and then pause. Why does it back up before pausing? So you can more easily sync up your typing. If you want to back up more, double-click the mouse.

One unexpected nice feature I found is that when you start the script, it reboots the Arduino, so you don't have to reach down and press the reset button.