#gpt-2

from Hackernoon
1 year ago

Empirical Results: GPT-2 Analysis of Transformer Memorization & Loss | HackerNoon

These experiments with GPT-2 medium on OpenWebText validate the radius hypothesis from the authors' theoretical framework by measuring activation distances in the last layer used for next-token prediction.
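The blurb above mentions measuring last-layer activation distances in GPT-2 medium for next-token prediction. As a rough illustration only (not the paper's exact procedure), a minimal sketch with the Hugging Face transformers library might look like the following; the model id, prompts, and Euclidean distance metric are assumptions.

```python
# Minimal sketch: extract GPT-2 medium last-layer activations at the
# next-token position and compare two contexts by Euclidean distance.
# Assumptions: Hugging Face transformers, "gpt2-medium", torch.dist as metric.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2Model.from_pretrained("gpt2-medium")
model.eval()

def last_layer_activation(text: str) -> torch.Tensor:
    """Return the final-layer hidden state at the last position of the context."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # out.last_hidden_state has shape (batch, seq_len, hidden); take the last position.
    return out.last_hidden_state[0, -1]

a = last_layer_activation("The capital of France is")
b = last_layer_activation("The capital of Germany is")
print(torch.dist(a, b).item())  # illustrative distance between the two contexts
```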
from Hackernoon
1 year ago

GPT-2 Study Shows How Language Models Can Amplify Political Bias | HackerNoon

The study emphasizes the importance of addressing bias amplification in large language models, particularly in the context of political bias in media.
from Hackernoon
10 months ago

How Tokenizer Choices Shape Hidden Risks in Popular Language Models | HackerNoon

Tokenization choices can expose under-trained tokens that degrade model performance.
Identifying and addressing under-trained tokens is crucial for improving language model accuracy.
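One commonly used heuristic for spotting under-trained tokens is to look for input embeddings with unusually small norms, since tokens that rarely or never appear in training receive few gradient updates. The sketch below illustrates that heuristic only; it is an assumption, and the article's actual detection method may differ, as may the cutoff value.

```python
# Minimal sketch: flag candidate under-trained tokens in GPT-2 by small
# input-embedding L2 norm. Heuristic and threshold are assumptions,
# not the article's method.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")

embeddings = model.get_input_embeddings().weight.detach()  # (vocab_size, hidden)
norms = embeddings.norm(dim=1)

# Tokens whose embedding norm sits far below the vocabulary median are
# candidates for being under-trained.
threshold = norms.median() * 0.5  # illustrative cutoff
candidate_ids = (norms < threshold).nonzero(as_tuple=True)[0]
for tid in candidate_ids[:20]:
    print(tid.item(), repr(tokenizer.decode([tid.item()])))
```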