Using AI to invent new Chinese characters
What do we get when we apply a simple deep convolutional GAN to Chinese handwriting data? From a dataset of just 178 unique “hanzi” characters and 25k images, what latent features of Chinese script will an AI learn to recreate?
GANs have been making headlines recently, especially for generating hyper-realistic faces (This Person Does Not Exist). So, I wanted to build a GAN of my own.
What are GANs?
GAN stands for Generative Adversarial Network. This model pits two independent neural networks against each other: one is tasked with learning to create convincing forgeries, while the other is tasked with learning to catch them. Coupled together, these adversaries can learn to generate incredibly convincing fake data.
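The forger/forgery-catcher dynamic comes down to two opposing loss functions. A minimal sketch, following the formulation used in TensorFlow's DCGAN tutorial (cross-entropy on raw logits):

```python
import tensorflow as tf

# Cross-entropy computed on raw logits, as in the TensorFlow DCGAN tutorial.
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    # The forgery-catcher wants to label real images 1 and forgeries 0.
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss

def generator_loss(fake_output):
    # The forger wants its fakes to be labeled 1, i.e., mistaken for real.
    return cross_entropy(tf.ones_like(fake_output), fake_output)
```

Each training step updates both networks: the discriminator to drive its loss down, the generator to fool the updated discriminator.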
Setting it up
This experiment is a spinoff of TensorFlow’s Deep Convolutional Generative Adversarial Network tutorial. That walkthrough specifically discusses generating images of handwritten Arabic numerals. TensorFlow is a fantastic resource for open-source machine learning, and their website covers tutorials for all sorts of trending machine learning models.
The input data source was a subset of the comprehensive CASIA Chinese Handwriting Database. Specifically, it is the 178 characters of the HSK 1 test, prepared by Kaggle user Vitalii Kyzym. Using just 178 characters out of the thousands available was a matter of reducing the data storage requirements. Shown below is a sample of the input images:
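Before training, the handwriting scans need to be normalized into a uniform tensor pipeline. A sketch of that preprocessing, assuming the scans are loaded as grayscale arrays (the resolution and batch size here are illustrative, not necessarily what the notebook uses):

```python
import tensorflow as tf

IMG_SIZE = 64  # assumed working resolution; the raw scans vary in size

def preprocess(image):
    # Resize each scan, then rescale uint8 pixels [0, 255] to [-1, 1]
    # to match the tanh output range of a DCGAN generator.
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    return (image - 127.5) / 127.5

def make_dataset(images, batch_size=256):
    # Shuffle-and-batch pipeline in the style of the DCGAN tutorial.
    return (tf.data.Dataset.from_tensor_slices(images)
            .map(preprocess)
            .shuffle(len(images))
            .batch(batch_size))
```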
To follow along or replicate the results of this model, open this notebook in Colab.
Various improvements were made to the DCGAN structure to handle the more complex images of Chinese characters. This included adjusting the layer sizes to fit more features in the input space and adding more convolutional layers to recognize more features of Chinese handwriting. In other words, the network was scaled up, since Chinese characters are considerably more complicated to generate and discern than the digits 0–9.
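A scaled-up generator along these lines might look like the following. This is a sketch, not the notebook's exact architecture: the layer widths and the 64×64 output resolution are assumptions, but the pattern of adding an extra Conv2DTranspose upsampling stage beyond the tutorial's MNIST generator is the idea described above.

```python
import tensorflow as tf
from tensorflow.keras import layers

NOISE_DIM = 100

def make_generator():
    # Upsamples a latent vector 8x8 -> 16x16 -> 32x32 -> 64x64, one more
    # stage (and wider layers) than the tutorial's 28x28 MNIST generator.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(NOISE_DIM,)),
        layers.Dense(8 * 8 * 256, use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((8, 8, 256)),
        layers.Conv2DTranspose(128, 5, strides=2, padding="same", use_bias=False),  # -> 16x16
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(64, 5, strides=2, padding="same", use_bias=False),   # -> 32x32
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(1, 5, strides=2, padding="same",
                               use_bias=False, activation="tanh"),                  # -> 64x64
    ])
```

The discriminator would be scaled up symmetrically, with extra strided Conv2D layers to downsample the larger input.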
Note that 300 epochs of training took about 3 hours on a Google Colab GPU as part of a Colab Pro account. Training on a GPU is highly recommended: on a free Colab account left on the default CPU runtime, training becomes several orders of magnitude slower.
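In Colab, a GPU is enabled via Runtime → Change runtime type. It is worth confirming TensorFlow actually sees the device before kicking off a multi-hour run:

```python
import tensorflow as tf

# Check for an attached GPU before starting a long training run.
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    print(f"Training on {len(gpus)} GPU(s)")
else:
    print("No GPU found; training on CPU will be orders of magnitude slower")
```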
Reviewing the results
Epoch 0
The neural network starts learning from literal random noise. It will eventually devise mathematical operations to transform this random noise into what it thinks Chinese looks like.
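A common way to watch this progression (used in the TensorFlow tutorial) is to draw one fixed batch of noise and reuse it after every epoch, so each progress grid shows the same latent points evolving. A sketch, where the grid size of 16 matches the samples shown below but is otherwise an assumption:

```python
import tensorflow as tf

NUM_EXAMPLES, NOISE_DIM = 16, 100  # 16 glyphs per progress grid (assumed)

# One fixed batch of latent noise, sampled once and reused every epoch,
# so the progress images track the same 16 points in latent space.
seed = tf.random.normal([NUM_EXAMPLES, NOISE_DIM])

# At the end of each epoch: images = generator(seed, training=False)
```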
Epoch 12
The AI now understands Chinese as some vertical black things on a white background.
Epoch 28
Combinations of vertical and horizontal strokes emerge.
Epoch 37
Familiar shapes begin to emerge. One could potentially recognize a 说 or 品.
Epoch 65
More complex strokes emerge, resembling the hook of 或.
Epoch 88
Compositions of strokes grow increasingly sophisticated and convincing. From a distance, one could mistake this for Chinese.
Epoch 173
The neural network generates a 土 and a 王.
Epoch 195
Are those radicals we are seeing? (Like the left portions of 很, 佷, and 恨.)
Epoch 266
The model shows no obvious improvements from this point forward. The 16-character sample contains many uncanny shapes and forms, reminiscent of 基, 之, 书, 奴, and 店.
The style of Chinese
Chinese and English script differ aesthetically at a fundamental level. The English alphabet is primarily derived from scripts with long histories of being carved into stone or marble. Chinese script, on the other hand, evolved out of a history of being brushed with ink. That is why the alphabet features simple, straight lines and curves, while Chinese “hanzi” commonly contain many fluid strokes.
It is remarkable to see a simple neural network find these latent features and replicate them. This has been my first use of my new Google Colab Pro subscription, and I’m looking forward to doing more with it in the near future!