Will the non-English genAI problem lead to data transparency and lower costs?
Briefly

The article discusses the opacity surrounding training data for AI models, focusing on the contrasting approaches of OpenAI and DeepSeek. It notes DeepSeek's clever use of open source to lower its model costs, while raising concerns about the hidden influence of Chinese government funding. The author welcomes the potential price reductions in generative AI driven by competition from DeepSeek but emphasizes that until those cuts materialize, the secrecy around training data for non-English models remains a problem for IT executives.
There are a variety of reasons why model makers don't disclose the particulars of their training data. The case of OpenAI and DeepSeek highlights these data transparency issues.
DeepSeek's lower costs reflect efficiencies gained through open source, but there is little transparency about potential Chinese government funding.
If DeepSeek can exert downward pressure on genAI pricing, it could significantly benefit IT executives, provided they see real price reductions.
Until model makers' price tags show substantial cuts, the lack of transparency in training data for non-English models remains a concern.
Read at Computerworld