OpenAI has released three flavors of GPT-2 models to date: the “small” 124M-parameter model (500MB on disk), the “medium” 355M model (1.5GB on disk), and recently the “large” 774M model (3GB on disk). At the same time, the Python code which allows anyone to download the models (albeit only the smaller versions, out of concern that the full model could be abused to mass-generate fake news) and the TensorFlow code to load a downloaded model and generate predictions were open-sourced on GitHub.

Neil Shepperd created a fork of OpenAI’s repo which contains additional code to allow finetuning the existing OpenAI models on custom datasets. A notebook was created soon after, which can be copied into Google Colaboratory; it clones Shepperd’s repo and finetunes GPT-2 backed by a free GPU. From there, the proliferation of GPT-2-generated text took off: researchers such as Gwern Branwen made GPT-2 Poetry, and Janelle Shane made GPT-2 Dungeons and Dragons character bios.

I waited to see if anyone would make a tool to help streamline this finetuning and text-generation workflow, à la textgenrnn, which I had made for recurrent-neural-network-based text generation. Enter gpt-2-simple, a Python package which wraps Shepperd’s finetuning code in a functional interface and adds many utilities for model management and generation control. Thanks to gpt-2-simple and this Colaboratory Notebook, you can easily finetune GPT-2 on your own dataset with a simple function, and generate text to your own specifications!

How GPT-2 Works

The actual Transformer architecture GPT-2 uses is very complicated to explain (here’s a great lecture). These models are also much larger than what you see in typical AI tutorials, and harder to wield: the “small” model hits GPU memory limits while finetuning on consumer GPUs, the “medium” model requires additional training techniques before it can be finetuned on server GPUs without going out of memory, and the “large” model cannot currently be finetuned at all on server GPUs without going OOM, even with those techniques.
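To make the size tradeoffs concrete, here is a hypothetical helper that picks the largest GPT-2 model practical to finetune given available GPU memory. The memory thresholds are rough assumptions based on the limits described above (consumer GPUs around 8–11GB, server GPUs around 16GB), not measurements, and the `largest_finetunable` function is mine, not part of any library.

```python
# Hypothetical helper illustrating the finetuning limits described above.
# The GB thresholds are assumptions, not benchmarks.
MODELS = {
    "124M": 8,     # "small": finetunable on consumer GPUs, with little headroom
    "355M": 16,    # "medium": needs a server GPU plus extra training techniques
    "774M": None,  # "large": currently goes OOM when finetuning, even on servers
}

def largest_finetunable(gpu_memory_gb):
    """Return the largest model name finetunable in gpu_memory_gb of VRAM, or None."""
    best = None
    for name, needed_gb in MODELS.items():
        if needed_gb is not None and gpu_memory_gb >= needed_gb:
            best = name  # dict is ordered small -> large, so keep the last fit
    return best

print(largest_finetunable(11))  # consumer GPU -> 124M
print(largest_finetunable(16))  # server GPU -> 355M
```

Note that all three models can still *generate* text on modest hardware; these limits apply to finetuning, which must hold gradients and optimizer state in memory as well.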
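The gpt-2-simple workflow described above can be sketched in a few lines, following the package’s README. This is a minimal sketch, assuming `pip3 install gpt-2-simple` and a plain-text training file named `input.txt` (a placeholder name); the model download is ~500MB and finetuning realistically needs a GPU, so the heavy work is kept inside a function rather than run at import time.

```python
def finetune_and_generate(dataset="input.txt", steps=1000):
    # Imported lazily: gpt-2-simple pulls in TensorFlow, a heavy dependency.
    import gpt_2_simple as gpt2

    # Fetch the "small" 124M model from OpenAI (~500MB, downloaded once).
    gpt2.download_gpt2(model_name="124M")

    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess,
                  dataset=dataset,      # plain-text file to finetune on
                  model_name="124M",
                  steps=steps)          # more steps = closer fit, more time

    # Print a sample of text generated by the finetuned model.
    gpt2.generate(sess)

if __name__ == "__main__":
    finetune_and_generate()
```

The same `finetune` and `generate` calls are what the Colaboratory Notebook runs against its free GPU, with the finetuned checkpoint saved so generation can be resumed later.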