Person detection with MicroPython
Although the final goal of this demo is to capture images from a camera and to decide on the spot whether a person is in sight, the first step is to test the model with well-known test images. We use a pre-trained, quantized model. Re-creating and re-training the model is non-trivial because the model provided was built with an outdated version of TensorFlow. Building and training a new Keras model would be an interesting task.
You can find one example of the person detection demo in the TFLM sources, and a different version for the esp-idf and the Arduino SDK. For MicroPython I tried to port both versions and compared the results with those of their C++ originals.
The TFLM version
Only two test images are used here, one of a person and one of a giraffe:
| person | no person |
Invoking the model returns a score in the range -128 to 127, which can be converted to a probability through de-quantization. The microlite module provides quantizeInt8ToFloat, which accomplishes this task.
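The arithmetic behind de-quantization can be sketched in a few lines of plain Python. This is only an illustration of the usual affine formula, real = (q - zero_point) * scale, and not the implementation inside microlite. The helper name dequantize_int8 and the parameter values (scale 1/256, zero point -128) are assumptions made for the example; the real values belong to the model's output tensor.

```python
# Sketch of affine int8 de-quantization: real = (q - zero_point) * scale.
# SCALE and ZERO_POINT are ASSUMED values for illustration; the actual
# parameters are stored with the model's output tensor.
SCALE = 1.0 / 256
ZERO_POINT = -128


def dequantize_int8(q, scale=SCALE, zero_point=ZERO_POINT):
    """Map a quantized int8 score back to a floating point value."""
    if not -128 <= q <= 127:
        raise ValueError("score does not fit into int8")
    return (q - zero_point) * scale


# With the assumed parameters the int8 range maps onto roughly 0.0 .. 1.0
for score in (-128, 0, 127):
    print(score, dequantize_int8(score))
```

With these assumed parameters the lowest score maps to 0.0 and the highest to just under 1.0, which is why the de-quantized value can be read as a probability.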
Here is the result from the MicroPython program:
As you can see, the results are correct and the confidence levels are pretty high.
The esp-idf and Arduino version of the person detection model
These versions are pretty complex and they incorporate
- inference on test images
- camera readout
- display on a TFT screen
all in a single multi-tasking program. As the hardware used is different from my boards, I decided to disentangle these components and to provide separate programs for the test images and for the camera. The test images can be processed on both CPU boards, while the camera version is of course only possible with the FreeNove board, which incorporates the OV2640 camera.
Here are the test images used in the example:
| image 0 | image 1 | image 2 | image 3 | image 4 |
| image 5 | image 6 | image 7 | image 8 | image 9 |
The images are provided as binary files containing 96x96 uint8_t grayscale pixel values. To visualize them I wrote a short Python script (show-test-image.py) and another one (pixel2png.py) to convert the binary files into PNG images.
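To give an idea of how little is needed to make such a raw file viewable, here is a minimal sketch that wraps the 96x96 pixel block in a binary PGM header using only the standard library. It is not the pixel2png.py script mentioned above and it produces PGM rather than PNG; the function names raw_to_pgm and convert_file are hypothetical.

```python
# Sketch: wrap raw 96x96 uint8 grayscale pixels in a binary PGM (P5) header.
# This is NOT the author's pixel2png.py; names and output format are
# assumptions chosen to keep the example free of external libraries.
WIDTH = 96
HEIGHT = 96


def raw_to_pgm(pixels, width=WIDTH, height=HEIGHT):
    """Return the bytes of a PGM image built from raw grayscale pixels."""
    if len(pixels) != width * height:
        raise ValueError("expected %d pixels, got %d"
                         % (width * height, len(pixels)))
    header = "P5\n{} {}\n255\n".format(width, height).encode()
    return header + bytes(pixels)


def convert_file(src, dst):
    """Read a raw pixel file and write it out as a PGM image."""
    with open(src, "rb") as f:
        pixels = f.read()
    with open(dst, "wb") as f:
        f.write(raw_to_pgm(pixels))
```

Calling convert_file with the path of one of the raw test images and an output name ending in .pgm yields a file that most image viewers can open directly.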
Before the pixels are passed into the input tensor for the model invocation, the pixel values must be converted from uint8_t to the int8_t format expected by the model.
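The conversion itself amounts to shifting every pixel by 128, so that 0..255 becomes -128..127. The sketch below shows two equivalent ways of doing this in plain Python; the helper names are hypothetical and how the values are finally written into the input tensor depends on the microlite API, which is not shown here.

```python
# Sketch: convert uint8 grayscale pixels (0..255) to int8 values (-128..127).
# Helper names are hypothetical; writing into the input tensor is not shown.

def uint8_to_int8(pixels):
    """Return a list of signed values obtained by subtracting 128."""
    return [p - 128 for p in pixels]


def uint8_to_int8_bytes(pixels):
    """Return the same values as raw two's complement bytes.

    Flipping the top bit (XOR with 0x80) is equivalent to subtracting
    128 modulo 256, so the result can be stored in a byte buffer.
    """
    return bytes(bytearray(p ^ 0x80 for p in pixels))


sample = bytes([0, 127, 128, 255])
print(uint8_to_int8(sample))
print(uint8_to_int8_bytes(sample))
```

The list form is convenient when values are set one by one, while the byte form avoids creating one Python integer object per pixel, which matters on a microcontroller with little RAM.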
The program supplies its results in the above summary table. You may judge for yourself how well the person detection works. It seems that, after all, the difference between a person and a monkey is not that great!
--
Uli Raich - 2023-11-08