Innocent unicorns considered harmful? How to experiment with GPT-2 from R

By admin2010

August 1, 2025


When, in February of this year, OpenAI announced GPT-2 (Radford et al. 2019), a large Transformer-based language model trained on an enormous amount of web-scraped text, their announcement caught great attention, not just in the NLP community. This was primarily due to two facts. First, the samples of generated text were stunning.

Presented with the following input

In a shocking finding, scientist [sic] discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.

this was how the model continued:

The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.
Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.
Dr. Jorge Pérez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they found a small valley, with no other animals or humans. Pérez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow. […]

Second, “due to our concerns about malicious applications” (quote) they didn’t release the full model, but a smaller one that has less than one tenth the number of parameters. Nor did they make public the dataset or the training code.

While at first glance, this may look like a marketing move (we created something so powerful that it’s too dangerous to be released to the public!), let’s not make things that easy on ourselves.

With great power …

Whatever your take on the “innate priors in deep learning” discussion – how much knowledge needs to be hardwired into neural networks for them to solve tasks that involve more than pattern matching? – there is no doubt that in many areas, systems driven by “AI” will impact our lives in an essential, and ever more powerful, way. Although there may be some awareness of the ethical, legal, and political problems this poses, it’s probably fair to say that by and large, society is closing its eyes and holding its hands over its ears.

If you were a deep learning researcher working in an area susceptible to abuse, generative ML say, what options would you have? As always in the history of science, what can be done will be done; all that remains is the search for antidotes. You may doubt that on a political level, constructive responses could evolve. But you can encourage other researchers to scrutinize the artifacts your algorithm created and develop other algorithms designed to spot the fakes – essentially like in malware detection. Of course this is a feedback system: Like with GANs, impostor algorithms will happily take the feedback and go on working on their shortcomings. But still, deliberately entering this circle might be the only viable action to take.

Although it may be the first thing that comes to mind, the question of veracity here isn’t the only one. With ML systems, it’s always: garbage in – garbage out. What is fed as training data determines the quality of the output, and any biases in its upbringing will carry through to an algorithm’s grown-up behavior. Without interventions, software designed to do translation, autocompletion and the like will be biased.

In this light, all we can sensibly do is – constantly – point out the biases, analyze the artifacts, and conduct adversarial attacks. These are the kinds of responses OpenAI was asking for. In appropriate modesty, they called their approach an experiment. Put plainly, no-one today knows how to deal with the threats emerging from powerful AI appearing in our lives. But there is no way around exploring our options.

The story unfolding

Three months later, OpenAI published an update to the initial post, stating that they had decided on a staged-release strategy. In addition to making public the next-in-size, 355M-parameter version of the model, they also released a dataset of generated outputs from all model sizes, to facilitate research. Last but not least, they announced partnerships with academic and non-academic institutions, to increase “societal preparedness” (quote).

Another three months later, in a new post, OpenAI announced the release of a yet larger – 774M-parameter – version of the model. At the same time, they reported evidence demonstrating insufficiencies in current statistical fake detection, as well as study results suggesting that indeed, text generators exist that can trick humans.

Due to those results, they said, no decision had yet been taken as to the release of the biggest, the “real” model, of size 1.5 billion parameters.

GPT-2

So what is GPT-2? Among state-of-the-art NLP models, GPT-2 stands out due to the gigantic (40G) dataset it was trained on, as well as its enormous number of weights. The architecture, in contrast, wasn’t new when it appeared. GPT-2, as well as its predecessor GPT (Radford 2018), is based on a transformer architecture.

The original Transformer (Vaswani et al. 2017) is an encoder-decoder architecture designed for sequence-to-sequence tasks, like machine translation. The paper introducing it was called “Attention is all you need,” emphasizing – by absence – what you don’t need: RNNs.

Before its publication, the prototypical model for e.g. machine translation would use some form of RNN as an encoder, some form of RNN as a decoder, and an attention mechanism that, at each time step of output generation, told the decoder where in the encoded input to look. The Transformer dispensed with RNNs, essentially replacing them by a mechanism called self-attention where already during encoding, the encoder stack would encode each token not independently, but as a weighted sum of tokens encountered before (including itself).
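To make the “weighted sum of tokens encountered before (including itself)” idea concrete, here is a minimal base-R sketch of a single attention head with a causal mask. It is an illustration under assumptions, not GPT-2’s actual implementation: the function name causal_self_attention, the toy dimensions, and the random projection matrices Wq, Wk and Wv are all made up for the example.

```r
# Single-head causal self-attention on a toy sequence (base R only).
# x: one row per token (n_tokens x d_model).
# Wq, Wk, Wv: hypothetical projection matrices (d_model x d_head).
causal_self_attention <- function(x, Wq, Wk, Wv) {
  q <- x %*% Wq
  k <- x %*% Wk
  v <- x %*% Wv
  # Scaled dot-product scores between every pair of positions.
  scores <- q %*% t(k) / sqrt(ncol(k))
  # Causal mask: token i may only attend to tokens 1..i.
  scores[upper.tri(scores)] <- -Inf
  # Row-wise softmax turns scores into attention weights.
  weights <- t(apply(scores, 1, function(s) {
    e <- exp(s - max(s))
    e / sum(e)
  }))
  # Each output row is a weighted sum of the value vectors seen so far.
  list(weights = weights, output = weights %*% v)
}

set.seed(1)
n_tokens <- 4
d_model  <- 8
d_head   <- 3
x  <- matrix(rnorm(n_tokens * d_model), n_tokens, d_model)
Wq <- matrix(rnorm(d_model * d_head), d_model, d_head)
Wk <- matrix(rnorm(d_model * d_head), d_model, d_head)
Wv <- matrix(rnorm(d_model * d_head), d_model, d_head)
att <- causal_self_attention(x, Wq, Wk, Wv)
```

Printing att$weights shows a lower-triangular matrix whose rows sum to one: the first token can only attend to itself, the last one to the whole sequence.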

Many subsequent NLP models built on the Transformer, but – depending on purpose – either picked up the encoder stack only, or just the decoder stack. GPT-2 was trained to predict consecutive words in a sequence. It is thus a language model, a term echoing the idea that an algorithm which can predict future words and sentences somehow has to understand language (and a lot more, we might add). As there is no input to be encoded (apart from an optional one-time prompt), all that is needed is the stack of decoders.

In our experiments, we’ll be using the biggest as-yet released pretrained model, but this being a pretrained model, our degrees of freedom are limited. We can, of course, condition on different input prompts. In addition, we can influence the sampling algorithm used.

Sampling options with GPT-2

Whenever a new token is to be predicted, a softmax is taken over the vocabulary. Directly taking the most probable token from the softmax output amounts to maximum likelihood estimation. In practice, however, always choosing the maximum likelihood estimate results in highly repetitive output.

A natural option seems to be using the softmax outputs as probabilities: Instead of just taking the argmax, we sample from the output distribution. Unfortunately, this procedure has negative ramifications of its own. In a big vocabulary, very improbable words together make up a substantial part of the probability mass; at every step of generation, there is thus a non-negligible probability that an improbable word may be chosen. This word will now exert great influence on what is chosen next. In that way, highly improbable sequences can build up.
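The contrast between argmax and plain sampling can be sketched on a toy vocabulary. Everything here is invented for illustration (the logits, the word list, and the 10,000 “rare” tokens); the point is only that many individually negligible tokens can add up to a sizeable share of the probability mass.

```r
# Numerically stable softmax over a vector of logits.
softmax <- function(z) {
  e <- exp(z - max(z))
  e / sum(e)
}

# Toy vocabulary: three plausible words plus a long tail of rare ones.
vocab  <- c("the", "unicorn", "valley", paste0("rare_", 1:10000))
logits <- c(5, 4, 3, rep(-5, 10000))   # hypothetical scores
probs  <- softmax(logits)

# Greedy decoding: deterministic, always the same token.
greedy <- vocab[which.max(probs)]

# Share of probability mass held by the tail (about 0.23 here),
# although each single rare token is very unlikely.
tail_mass <- sum(probs[-(1:3)])

# Plain sampling from the full distribution.
set.seed(42)
sampled <- sample(vocab, size = 20, replace = TRUE, prob = probs)
```

With these made-up numbers, roughly one draw in four lands in the tail, which is the kind of build-up of improbable continuations described above.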

The task thus is to navigate between the Scylla of determinism and the Charybdis of weirdness. With the GPT-2 model presented below, we have three options:

vary the temperature (parameter temperature);
vary top_k, the number of tokens considered; or
vary top_p, the probability mass considered.

The temperature concept is rooted in statistical mechanics. Looking at the Boltzmann distribution used to model state probabilities \(p_i\) as a function of energy \(\epsilon_i\):

\[p_i \sim e^{-\frac{\epsilon_i}{kT}}\]

we see there is a moderating variable, the temperature \(T\), that, depending on whether it is below or above 1, will exert either an amplifying or an attenuating influence on differences between probabilities.

Analogously, in the context of predicting the next token, the individual logits are divided by the temperature, and only then is the softmax taken. Temperatures below 1 would make the model even more rigorous in choosing the maximum likelihood candidate; instead, we’d be interested in experimenting with temperatures above 1 to give higher chances to less likely candidates – hopefully, resulting in more human-like text.
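As a small illustration of temperature scaling, the following sketch divides a vector of made-up logits by the temperature before applying the softmax. The helper name softmax_t and the numbers are assumptions for the example, not part of the gpt2 package.

```r
# Softmax with temperature: logits are divided by the temperature first.
softmax_t <- function(logits, temperature = 1) {
  stopifnot(temperature > 0)
  z <- logits / temperature
  e <- exp(z - max(z))
  e / sum(e)
}

logits <- c(2.0, 1.0, 0.1)           # hypothetical scores for three tokens

p_cold <- softmax_t(logits, 0.5)     # below 1: differences are amplified
p_ref  <- softmax_t(logits, 1)       # 1: the plain softmax
p_hot  <- softmax_t(logits, 2)       # above 1: differences are attenuated

round(rbind(p_cold, p_ref, p_hot), 3)
```

The top candidate’s probability shrinks as the temperature rises, while the least likely candidate gains.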

In top-\(k\) sampling, the softmax outputs are sorted, and only the top-\(k\) tokens are considered for sampling. The difficulty here is how to choose \(k\). Sometimes a few words make up almost all of the probability mass, in which case we’d like to choose a low number; in other cases the distribution is flat, and a higher number would be adequate.
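A minimal sketch of the top-\(k\) idea, assuming we already have a probability vector: keep the \(k\) largest entries, zero out the rest, and renormalize. The helper top_k_filter and the four-token distribution are hypothetical.

```r
# Keep only the k most probable tokens and renormalize.
top_k_filter <- function(probs, k) {
  stopifnot(k >= 1, k <= length(probs))
  keep <- order(probs, decreasing = TRUE)[seq_len(k)]
  out <- numeric(length(probs))
  names(out) <- names(probs)
  out[keep] <- probs[keep]
  out / sum(out)
}

probs    <- c(a = 0.5, b = 0.3, c = 0.15, d = 0.05)
filtered <- top_k_filter(probs, k = 2)

# Sampling now only ever returns "a" or "b".
set.seed(1)
draws <- sample(names(filtered), size = 10, replace = TRUE, prob = filtered)
```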

This suggests that, rather than the number of candidates, a target probability mass should be specified. That is the approach suggested by Holtzman et al. (2019). Their method, called top-\(p\), or Nucleus sampling, computes the cumulative distribution of softmax outputs and picks a cut-off point \(p\). Only the tokens constituting the top-\(p\) portion of probability mass are retained for sampling.
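The cut-off logic can be sketched as follows. This is one plausible reading of the procedure (keep the smallest set of top-ranked tokens whose cumulative mass reaches \(p\)); actual implementations may treat the boundary token differently, and top_p_filter is a hypothetical helper.

```r
# Keep the smallest prefix of tokens (sorted by probability) whose
# cumulative mass reaches p, then renormalize.
top_p_filter <- function(probs, p) {
  stopifnot(p > 0, p <= 1)
  ord <- order(probs, decreasing = TRUE)
  cum <- cumsum(probs[ord])
  # Fall back to the full vocabulary if rounding keeps cum just below p.
  n_keep <- min(which(cum >= p), length(probs))
  out <- numeric(length(probs))
  out[ord[seq_len(n_keep)]] <- probs[ord[seq_len(n_keep)]]
  out / sum(out)
}

peaked <- c(0.90, 0.05, 0.03, 0.02)
flat   <- rep(0.25, 4)

# The same p keeps one candidate in the peaked case and all four in the flat one.
n_peaked <- sum(top_p_filter(peaked, 0.8) > 0)
n_flat   <- sum(top_p_filter(flat, 0.8) > 0)
```

This is the adaptivity that a fixed \(k\) lacks: the number of candidates follows the shape of the distribution.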

Now all you need to experiment with GPT-2 is the model.

Setup

Install gpt2 from GitHub:
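A likely form of this step, assuming the package is hosted in the r-tensorflow/gpt2 repository and that the remotes package is installed (both are assumptions, since the command itself is not shown here):

```r
# Assumed repository location; adjust if the package lives elsewhere.
remotes::install_github("r-tensorflow/gpt2")
```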

The R package being a wrapper to the implementation provided by OpenAI, we then need to install the Python runtime.

gpt2::install_gpt2(envname = "r-gpt2")

This command will also install TensorFlow into the designated environment. All TensorFlow-related installation options (resp. recommendations) apply. Python 3 is required.

While OpenAI indicates a dependency on TensorFlow 1.12, the R package was adapted to work with more current versions. The following versions have been found to be working fine:

if running on GPU: TF 1.15
CPU-only: TF 2.0

Unsurprisingly, with GPT-2, running on GPU vs. CPU makes a huge difference.

As a quick test of whether installation was successful, just run gpt2() with the default parameters:

library(gpt2)

# equivalent to:
# gpt2(prompt = "Hello my name is", model = "124M", seed = NULL, batch_size = 1, total_tokens = NULL,
#      temperature = 1, top_k = 0, top_p = 1)
# see ?gpt2 for an explanation of the parameters
#
# available models as of this writing: 124M, 355M, 774M
#
# on first run of a given model, allow time for download
gpt2()

Things to try out

So how dangerous exactly is GPT-2? We can’t say, as we don’t have access to the “real” model. But we can compare outputs, given the same prompt, obtained from all available models. The number of parameters has more than doubled at each release so far – 124M, 355M, 774M. The biggest, as yet unreleased, model again has about twice the number of weights: about 1.5B. In light of the evolution we observe, what do we expect to get from the 1.5B version?
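One way to organize such a comparison is a small wrapper that runs the same prompt through every model size. The sketch below is hypothetical: compare_models is not part of the gpt2 package, and it only assumes that gpt2() accepts the prompt and model arguments shown earlier. A stand-in generator is used so that the example runs without any downloads.

```r
# Run one prompt through several model sizes and collect the outputs.
# `generate` defaults to gpt2::gpt2 but can be swapped for testing.
compare_models <- function(prompt,
                           models = c("124M", "355M", "774M"),
                           generate = gpt2::gpt2,
                           ...) {
  outputs <- lapply(models, function(m) {
    generate(prompt = prompt, model = m, ...)
  })
  names(outputs) <- models
  outputs
}

# Stand-in for the real generator (no model is loaded here).
fake_generate <- function(prompt, model, ...) {
  paste0("[", model, "] ", prompt, " ...")
}

res <- compare_models("Hello my name is", generate = fake_generate)
```

With the package installed, dropping the generate argument and passing sampling options through the dots (for example top_k = 40) would call the real models instead.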

In performing these kinds of experiments, don’t forget about the different sampling strategies explained above. Non-default parameters might yield more real-looking results.

Needless to say, the prompt we specify will make a difference. The models have been trained on a web-scraped dataset, subject to the quality criterion “3 stars on reddit”. We expect more fluency in certain areas than in others, to put it in a cautious way.

Most definitely, we expect various biases in the outputs.

Undoubtedly, by now the reader will have her own ideas about what to test. But there is more.

“Language Models are Unsupervised Multitask Learners”

Here we are citing the title of the official GPT-2 paper (Radford et al. 2019). What is that supposed to mean? It means that a model like GPT-2, trained to predict the next token in naturally occurring text, can be used to “solve” standard NLP tasks that, in the majority of cases, are approached via supervised training (translation, for example).

The clever idea is to present the model with cues about the task at hand. Some information on how to do this is given in the paper; more (unofficial; conflicting or confirming) hints can be found on the web. From what we found, here are some things you could try.

Summarization

The clue to induce summarization is “TL;DR:” written on a line by itself. The authors report that this worked best setting top_k = 2 and asking for 100 tokens. Of the generated output, they took the first three sentences as a summary.

To try this out, we chose a sequence of content-wise standalone paragraphs from a NASA website dedicated to climate change, the idea being that with a clearly structured text like this, it should be easier to establish relationships between input and output.

# put this in a variable called text

The planet's average surface temperature has risen about 1.62 degrees Fahrenheit
(0.9 degrees Celsius) since the late 19th century, a change driven largely by
increased carbon dioxide and other human-made emissions into the atmosphere.4 Most
of the warming occurred in the past 35 years, with the five warmest years on record
taking place since 2010. Not only was 2016 the warmest year on record, but eight of
the 12 months that make up the year – from January through September, with the
exception of June – were the warmest on record for those respective months.

The oceans have absorbed much of this increased heat, with the top 700 meters
(about 2,300 feet) of ocean showing warming of more than 0.4 degrees Fahrenheit
since 1969.

The Greenland and Antarctic ice sheets have decreased in mass. Data from NASA's
Gravity Recovery and Climate Experiment show Greenland lost an average of 286
billion tons of ice per year between 1993 and 2016, while Antarctica lost about 127
billion tons of ice per year during the same time period. The rate of Antarctica
ice mass loss has tripled in the last decade.

Glaciers are retreating almost everywhere around the world – including in the Alps,
Himalayas, Andes, Rockies, Alaska and Africa.

Satellite observations reveal that the amount of spring snow cover in the Northern
Hemisphere has decreased over the past five decades and that the snow is melting
earlier.

Global sea level rose about 8 inches in the last century. The rate in the last two
decades, however, is nearly double that of the last century and is accelerating
slightly every year.

Both the extent and thickness of Arctic sea ice has declined rapidly over the last
several decades.

The number of record high temperature events in the United States has been
increasing, while the number of record low temperature events has been decreasing,
since 1950. The U.S. has also witnessed increasing numbers of intense rainfall events.

Since the beginning of the Industrial Revolution, the acidity of surface ocean
waters has increased by about 30 percent.13,14 This increase is the result of humans
emitting more carbon dioxide into the atmosphere and hence more being absorbed into
the oceans. The amount of carbon dioxide absorbed by the upper layer of the oceans
is increasing by about 2 billion tons per year.

TL;DR:

gpt2(prompt = text,
     model = "774M",
     total_tokens = 100,
     top_k = 2)

Here is the generated result, whose quality we deliberately don’t comment on. (Of course one can’t help having “gut reactions”; but to actually present an evaluation we’d have to conduct a systematic experiment, varying not only input prompts but also function parameters. All we want to show in this post is how you can set up such experiments yourself.)

"\nGlobal temperatures are rising, but the rate of warming has been accelerating.
\n\nThe oceans have absorbed much of the increased heat, with the top 700 meters of
ocean showing warming of more than 0.4 degrees Fahrenheit since 1969.
\n\nGlaciers are retreating almost everywhere around the world, including in the
Alps, Himalayas, Andes, Rockies, Alaska and Africa.
\n\nSatellite observations reveal that the amount of spring snow cover in the
Northern Hemisphere has decreased over the past"
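The paper’s post-processing step (keeping only the first three sentences of the generated text) could be approximated with a small base-R helper. This is a rough sketch: first_sentences is a hypothetical function, and its sentence splitting is a naive heuristic (punctuation followed by whitespace), not a real sentence tokenizer.

```r
# Return the first n sentences of a generated string.
first_sentences <- function(x, n = 3) {
  # Collapse newlines and repeated spaces, then trim the ends.
  x <- trimws(gsub("\\s+", " ", x))
  # Split after ., ! or ? when followed by whitespace
  # (so decimals like 0.4 are left intact).
  parts <- strsplit(x, "(?<=[.!?])\\s+", perl = TRUE)[[1]]
  paste(head(parts, n), collapse = " ")
}

generated <- paste0(
  "\nGlobal temperatures are rising.\n\nThe oceans warmed by 0.4 degrees. ",
  "Glaciers are retreating. Snow cover has decreased over the past"
)
summary_text <- first_sentences(generated, n = 3)
```

Applied to output like the one above, this also drops the trailing fragment that results from the fixed token budget.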

Speaking of parameters to vary: they fall into two classes, in a way. It is unproblematic to vary the sampling strategy, let alone the prompt. But for tasks like summarization, or the ones we’ll see below, it doesn’t feel right to have to tell the model how many tokens to generate. Finding the right length of the answer seems to be part of the task. Breaking our “we don’t judge” rule just a single time, we can’t help but remark that even in less clear-cut tasks, language generation models that are meant to approach human-level competence would have to fulfill a criterion of relevance (Grice 1975).

Question answering

To trick GPT-2 into question answering, the common approach seems to be presenting it with a number of Q: / A: pairs, followed by a final question and a final A: on its own line.

We tried it like this, asking questions about the above climate change-related text:

library(stringr)

# drop the summarization cue, then append example Q/A pairs and an open question
q <- str_c(str_replace(text, "\nTL;DR:\n", ""), " \n", "
Q: What time period has seen the greatest increase in global temperature? 
A: The last 35 years. 
Q: What is happening to the Greenland and Antarctic ice sheets? 
A: They are rapidly decreasing in mass. 
Q: What is happening to glaciers? 
A: ")

gpt2(prompt = q,
     model = "774M",
     total_tokens = 10,
     top_p = 0.9)

This didn’t turn out so well.

"\nQ: What is happening to the Arctic sea"

But maybe more successful tricks exist.
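To try other variants of the Q: / A: pattern more systematically, the prompt can be assembled by a small helper. The following is only a sketch: build_qa_prompt is a hypothetical function, and the exact separators (blank line after the context, trailing “A: ”) are assumptions modeled on the attempt above rather than a documented format.

```r
# Assemble a few-shot question-answering prompt:
# context, example Q/A pairs, then the open question and a final "A: ".
build_qa_prompt <- function(context, examples, question) {
  shots <- vapply(
    examples,
    function(ex) paste0("Q: ", ex$q, "\nA: ", ex$a),
    character(1)
  )
  paste0(
    context, "\n\n",
    paste(shots, collapse = "\n"),
    "\nQ: ", question, "\nA: "
  )
}

examples <- list(
  list(q = "What time period has seen the greatest increase in global temperature?",
       a = "The last 35 years."),
  list(q = "What is happening to the Greenland and Antarctic ice sheets?",
       a = "They are rapidly decreasing in mass.")
)

qa_prompt <- build_qa_prompt(
  context  = "Glaciers are retreating almost everywhere around the world.",
  examples = examples,
  question = "What is happening to glaciers?"
)
cat(qa_prompt)
```

The resulting string can be passed as the prompt argument; varying the number of example pairs or the wording of the questions then only requires changing the inputs.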

Translation

For translation, the strategy presented in the paper is juxtaposing sentences in two languages, joined by " = ", followed by a single sentence on its own and a " =".
Thinking that English <-> French might be the combination best represented in the training corpus, we tried the following:

# save this as eng_fr

The issue of climate change concerns all of us. = La question du changement
climatique nous affecte tous. \n
The problems of climate change and global warming affect all of humanity, as well as
the entire ecosystem. = Les problèmes créés par les changements climatiques et le
réchauffement de la planète touchent toute l'humanité, de même que l'écosystème tout
entier.\n
Climate Change Central is a not-for-profit corporation in Alberta, and its mandate
is to reduce Alberta's greenhouse gas emissions. = Climate Change Central est une
société sans but lucratif de l'Alberta ayant pour mission de réduire les émissions
de gaz. \n
Climate change will affect all four dimensions of food security: food availability,
food accessibility, food utilization and food systems stability. =

gpt2(prompt = eng_fr,
     model = "774M",
     total_tokens = 25,
     top_p = 0.9)

Results varied a lot between different runs. Here are three examples:

"ét durant les pages relevantes du Centre d'Motion des Sciences Humaines et dans sa
species situé,"

"études des loi d'affaires, des causes de demande, des loi d'abord and de"

"étiquettes par les changements changements changements et les bois d'escalier,
ainsi que des"
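For experimenting with other sentence pairs or language combinations, the “sentence = translation” prompt can be generated from a table of examples. This is a sketch under assumptions: build_translation_prompt is a hypothetical helper, and joining pairs with a single newline is one guess at a reasonable layout, not the format prescribed by the paper.

```r
# Build a translation prompt of the form
#   <source 1> = <target 1>
#   <source 2> = <target 2>
#   <new source sentence> =
build_translation_prompt <- function(pairs, sentence) {
  lines <- paste0(pairs$en, " = ", pairs$fr)
  paste0(paste(lines, collapse = "\n"), "\n", sentence, " =")
}

pairs <- data.frame(
  en = c("The issue of climate change concerns all of us.",
         "Climate Change Central is a not-for-profit corporation in Alberta."),
  fr = c("La question du changement climatique nous affecte tous.",
         "Climate Change Central est une société sans but lucratif de l'Alberta."),
  stringsAsFactors = FALSE
)

tr_prompt <- build_translation_prompt(
  pairs,
  "Climate change will affect all four dimensions of food security."
)
cat(tr_prompt)
```

Since results vary between runs, it may also be worth fixing the seed argument listed among the gpt2() defaults when comparing prompt variants; how strictly that pins down the output is not something this sketch verifies.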

Conclusion

With that, we conclude our tour of “what to explore with GPT-2.” Keep in mind that the yet-unreleased model has double the number of parameters; essentially, what we see is not what we get.

This post’s goal was to show how you can experiment with GPT-2 from R. But it also reflects the decision to, from time to time, widen the narrow focus on technology and allow ourselves to think about ethical and societal implications of ML/DL.

Thanks for reading!

Grice, H. P. 1975. “Logic and Conversation.” In Syntax and Semantics: Vol. 3: Speech Acts, 41–58. Academic Press. http://www.ucl.ac.uk/ls/studypacks/Grice-Logic.pdf.

Holtzman, Ari, Jan Buys, Maxwell Forbes, and Yejin Choi. 2019. “The Curious Case of Neural Text Degeneration.” arXiv e-Prints, April, arXiv:1904.09751. https://arxiv.org/abs/1904.09751.

Radford, Alec. 2018. “Improving Language Understanding by Generative Pre-Training.”

Radford, Alec, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. “Language Models Are Unsupervised Multitask Learners.”

Sun, Tony, Andrew Gaut, Shirlyn Tang, Yuxin Huang, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth M. Belding, Kai-Wei Chang, and William Yang Wang. 2019. “Mitigating Gender Bias in Natural Language Processing: Literature Review.” CoRR abs/1906.08976. http://arxiv.org/abs/1906.08976.

Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 5998–6008. Curran Associates, Inc. http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf.
