Understanding Codex training data and outputs


OpenAI cares deeply about developers and is committed to respecting their rights. Our hope is that Codex will lower barriers to entry and increase opportunities for beginner programmers, make expert programmers more productive, and create new code-generation tools.

The Codex model was trained on tens of millions of public repositories, which were used as training data for research purposes in the design of Codex. We believe that is an instance of transformative fair use.

The source material from those public repositories is intended to be used for these research and training purposes only; it is not intended to be included verbatim in Codex outputs. Analysis has shown that, even in this early stage of development, the vast majority of output (>99%) does not match training data. Of course, certain source material, like all computer programs, contains common, widely used solutions that are standard, functionally mandated, or both.

During this early, developmental stage of Codex, we continue to refine the product in numerous ways. We welcome feedback from developers, including any questions or concerns they may have about the generated output during our free beta period.

Hello friend...

The Newcomer Community is a special community for new Steemit users. You can only publish Achievement 1 through Achievement 6 posts and compilation tasks in this community.

I invite you to participate in the Newcomer Achievement Program. As a first step to complete the Achievement 1 task you can follow this link: Achievement 1: Verification Through Introduction

Note

Publishing posts other than Achievement Tasks in the Newcomer Community is prohibited. As a penalty, we will mute such posts.

I appreciate your cooperation
Regards