EmoGalaxy 2

Designing a Serious Game for Autistic Children

By Mohammad Abaeiani

How Can We Help Autistic Children Through Games?

Serious games are a great way to teach children indirectly. In the specific case of autism, serious games can be utilized to enhance three main areas:

  • Emotion Recognition: The ability to identify a certain emotion from a single subject or to distinguish between two or more emotions.

  • Emotion Expression: The ability to express a certain emotion in a way that is comprehensible to other people.

  • Emotion Regulation: The ability to choose and express an appropriate emotion in a given situation.

The gameplay of our serious game will be divided into various scenarios based on the three criteria stated above.

Emotion Recognition: Single Subject Scenario

The simplest scenario for emotion recognition is asking the player to choose the objects that express a given target emotion. The gameplay of this scenario resembles games such as “Chicken Hunter” on the Nintendo DS and “Virtua Cop”: there are many objects on the screen, and the player must click on the correct objects to gain points, while clicking on wrong objects leads to a penalty.
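As a rough illustration, the core scoring rule of this scenario could look like the sketch below (the class and field names are ours, not taken from the actual project):

using UnityEngine;

// Hypothetical sketch of the single-subject recognition rule:
// clicking an object that expresses the target emotion gives points,
// clicking anything else applies a penalty.
public class EmotionTarget : MonoBehaviour
{
    public string Emotion;               // emotion this object expresses, e.g. "happy"
    public static string TargetEmotion;  // emotion the current round asks the player to find
    public static int Score;

    void OnMouseDown()  // requires a Collider on the object
    {
        if (Emotion == TargetEmotion)
            Score += 10;  // correct object: reward
        else
            Score -= 5;   // wrong object: punishment
    }
}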

Emotion Recognition: Distinguishing Multiple Objects

A more complicated scenario for training emotion recognition is distinguishing between multiple objects and matching the ones that share a common emotion. A well-known gameplay scenario for this task is the card matching game, which focuses on training short-term memory; the goal is for the player to match all the pairs with the same image or color.

We modified the card matching scenario so that the player can match cards that share a common emotion in addition to fully identical pairs. This way the player is encouraged to focus on the emotions to advance faster through each set of cards.
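A minimal sketch of this modified matching rule (with hypothetical card fields, not the project's actual data model) could be:

// Two cards match if they are fully identical OR merely share the same emotion.
public class Card
{
    public int ImageId;     // which picture is printed on the card
    public string Emotion;  // emotion that picture expresses

    public static bool IsMatch(Card a, Card b)
    {
        return a.ImageId == b.ImageId || a.Emotion == b.Emotion;
    }
}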

Emotion Expression

The emotion expression section in the original EmoGalaxy game was implemented as a gate barrier: the player had to express a certain emotion in order to enter the next section. The same gate mechanism was also used in a narrative scenario where the player had to express the same emotion as the characters in the story.
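In essence, the gate logic reduces to comparing the emotion predicted from the webcam with the emotion the gate asks for. A very small, hypothetical sketch (the names below are illustrative, not the project's actual API) could be:

using UnityEngine;

// Illustrative gate check: opens once the classifier's current prediction
// matches the emotion this gate requires.
public class EmotionGate : MonoBehaviour
{
    public string RequiredEmotion = "happy";  // emotion the player must express
    public Animator GateAnimator;             // plays the opening animation

    // Called whenever the emotion classifier produces a new prediction.
    public void OnEmotionPredicted(string predictedEmotion)
    {
        if (predictedEmotion == RequiredEmotion)
            GateAnimator.SetTrigger("Open");
    }
}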

Computer Vision and Emotion Classification

The original face recognition plugin had become a commercial product and was no longer accessible, so we had to develop a new way to perform emotion recognition.

After doing some research, we found that Unity had released a package called “Barracuda” that can run neural network models for inference. It loads an ONNX network and runs the inference procedure on a separate thread using workers, and it also lets the developer view the model architecture in the Unity Inspector.
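For context, loading an ONNX model and creating a worker with Barracuda looks roughly like this (the class and field names are ours; the ModelLoader and WorkerFactory calls come from the package's public API):

using Unity.Barracuda;
using UnityEngine;

public class ModelRunner : MonoBehaviour
{
    public NNModel modelAsset;  // ONNX model imported into Unity as an NNModel asset

    private Model runtimeModel;
    private IWorker worker;

    void Start()
    {
        // Build the runtime representation of the ONNX network
        runtimeModel = ModelLoader.Load(modelAsset);
        // Create a worker that schedules and runs inference
        worker = WorkerFactory.CreateWorker(WorkerFactory.Type.Auto, runtimeModel);
    }

    void OnDestroy()
    {
        worker?.Dispose();
    }
}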

Emotion recognition comes down to two separate tasks: the first is detecting the user's face in the live feed from the device webcam, and the second is feeding the extracted face image into the classifier model.

To achieve this, we use two networks, one for face detection and one for emotion classification:

  • The face detection network is a pre-trained model, the same one used in the MoodMe demo app. It can detect multiple faces and outputs the bounding box and confidence score of each face. The overall accuracy of the face detection model is 90%.

  • The emotion classifier is a model trained on the FER dataset in PyTorch and exported as an ONNX model. It takes a 48x48 image as input and outputs a 7x1 vector of emotion scores. The output emotions are [happy, sad, angry, scared, surprised, disgusted, neutral]. This model classifies emotions with 65% accuracy.
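To make that output concrete, a minimal sketch of turning the 7 scores into a label (the label order matches the list above; the score access uses Barracuda's flat Tensor indexer) could be:

using Unity.Barracuda;

// Maps the classifier's 7x1 score vector to an emotion label via argmax.
public static class EmotionDecoder
{
    static readonly string[] Labels =
        { "happy", "sad", "angry", "scared", "surprised", "disgusted", "neutral" };

    public static string Decode(Tensor scores)
    {
        int best = 0;
        for (int i = 1; i < Labels.Length; i++)
            if (scores[i] > scores[best]) best = i;
        return Labels[best];
    }
}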

Implementation in Unity

Face Detection

For the gate to work with the neural network, there are a few steps to take. First, the webcam feed must be captured and converted to a texture: we read the pixel colors from Unity's built-in WebCamTexture (the CameraTexture variable below), apply them to a Texture2D, and then copy that texture into an output buffer so other scripts can use it.

// Copy the latest frame from the WebCamTexture (CameraTexture) into a reusable Color32 buffer
CameraTexture.GetPixels32(GetPixels);
// Write those pixels into a Texture2D and upload the change
((Texture2D)WebcamTexture).SetPixels32(GetPixels);
((Texture2D)WebcamTexture).Apply();
// Copy the result into the texture that other scripts read from
Graphics.CopyTexture(WebcamTexture, ExportWebcamTexture);

By running this code in a loop we can access the live feed of the device webcam.

The next step is to resize the texture to 320x240, the input size of the face detector model, and run inference through a worker, Barracuda's construct for scheduling the network on a separate thread. Then, by reading the output boxes and their scores with the PeekOutput method and cropping the highest-scoring box out of the original image, the player's face is isolated. The resulting image must then be resized to a 48x48 single-channel image before it is fed to the emotion classifier model.
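A condensed sketch of this step is shown below. The output tensor names (“boxes” and “scores”), the assumption of one score per candidate box, and the normalized [x1, y1, x2, y2] box layout are our assumptions for illustration; they depend on the actual detector model.

using Unity.Barracuda;
using UnityEngine;

public static class FacePipeline
{
    public static Texture2D ExtractFace(IWorker faceWorker, Texture2D frame)
    {
        // 1. Resize the webcam frame to the detector's 320x240 input size
        Texture2D small = ResizeTo(frame, 320, 240);

        // 2. Run the face detector on a 3-channel input tensor
        var input = new Tensor(small, 3);
        faceWorker.Execute(input);
        Tensor boxes = faceWorker.PeekOutput("boxes");
        Tensor scores = faceWorker.PeekOutput("scores");

        // 3. Keep the detection with the highest confidence score
        int best = 0;
        for (int i = 1; i < scores.length; i++)
            if (scores[i] > scores[best]) best = i;

        // 4. Crop that box out of the original frame and shrink it to the 48x48
        //    image the emotion classifier expects (it becomes single-channel
        //    when the classifier's own input tensor is built)
        Texture2D face = CropBox(frame, boxes, best);
        input.Dispose();
        return ResizeTo(face, 48, 48);
    }

    // Resize a texture by blitting it into a temporary RenderTexture
    static Texture2D ResizeTo(Texture src, int width, int height)
    {
        RenderTexture rt = RenderTexture.GetTemporary(width, height);
        Graphics.Blit(src, rt);
        RenderTexture prev = RenderTexture.active;
        RenderTexture.active = rt;
        var result = new Texture2D(width, height, TextureFormat.RGB24, false);
        result.ReadPixels(new Rect(0, 0, width, height), 0, 0);
        result.Apply();
        RenderTexture.active = prev;
        RenderTexture.ReleaseTemporary(rt);
        return result;
    }

    // Crop the chosen box out of the frame; coordinates are assumed to be
    // normalized [x1, y1, x2, y2], which depends on the detector's output format.
    static Texture2D CropBox(Texture2D frame, Tensor boxes, int index)
    {
        int x = (int)(boxes[index * 4 + 0] * frame.width);
        int y = (int)(boxes[index * 4 + 1] * frame.height);
        int w = Mathf.Max(1, (int)(boxes[index * 4 + 2] * frame.width) - x);
        int h = Mathf.Max(1, (int)(boxes[index * 4 + 3] * frame.height) - y);

        var face = new Texture2D(w, h, TextureFormat.RGB24, false);
        face.SetPixels(frame.GetPixels(x, y, w, h));
        face.Apply();
        return face;
    }
}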
