Is it allowed to use OpenAI API outputs to create a silver code dataset or benchmark for a specific Python library? [D]
Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.
Hello everyone,
Is it allowed to use OpenAI API outputs to create a silver code dataset or benchmark for a specific Python library?
I am working on a project idea related to library-specific code generation. The concrete case is a specific Python library used in a technical/scientific domain. The goal would be to improve and evaluate how well code-generation models can use this library correctly.
I am trying to understand the legal / Terms of Service boundary around using OpenAI API outputs in two different scenarios:
Scenario 1: Silver dataset for fine-tuning an OSS model
Use the OpenAI API to generate programming tasks, reference solutions, and verification tests for the specific Python library.
The generated examples would then be human-reviewed, filtered, and validated. The resulting silver dataset would be used to fine-tune an open-source code model, with the goal of improving its performance on this specific library.
My question: would this violate OpenAI’s terms because the API outputs are being used to train/fine-tune another coding model, even if the scope is narrow and library-specific?
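To make the "filter and validate" step concrete, here is a minimal sketch of the automatic part I have in mind, before any human review: run each generated reference solution against its own generated tests and keep only the examples that pass. Everything here is an assumption for illustration: the dict layout with `task`, `solution`, and `tests` keys, the helper names `passes_own_tests` and `filter_candidates`, and the toy `add` example standing in for real library code. In practice the generated code would need to run in a proper sandbox, not a bare subprocess.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_own_tests(solution: str, tests: str, timeout: float = 10.0) -> bool:
    """Return True if the tests, run after the solution, exit with status 0.

    Hypothetical helper: solution and tests are plain Python source strings.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        # Concatenate solution and tests into one throwaway script.
        script.write_text(solution + "\n\n" + tests + "\n", encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            # Treat hanging candidates as failures.
            return False
        return result.returncode == 0


def filter_candidates(candidates):
    """Keep only candidates whose reference solution passes its own tests."""
    return [c for c in candidates if passes_own_tests(c["solution"], c["tests"])]


# Toy stand-ins for generated examples (assumed format, not real API output).
candidates = [
    {
        "task": "add two numbers",
        "solution": "def add(a, b):\n    return a + b",
        "tests": "assert add(2, 3) == 5",
    },
    {
        "task": "add two numbers (buggy)",
        "solution": "def add(a, b):\n    return a - b",
        "tests": "assert add(2, 3) == 5",
    },
]

kept = filter_candidates(candidates)
print([c["task"] for c in kept])
```

Passing its own tests only shows that a generated example is self-consistent, not that it uses the library correctly, which is why the human review step would still come afterwards.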
Scenario 2: Benchmark only, not training
Use the OpenAI API to generate programming tasks, reference solutions, and verification tests.
The generated items would be human-reviewed and validated, and the resulting dataset used only as an evaluation benchmark to compare different models. The benchmark would not be used to fine-tune or train any model.
My question: is this generally considered allowed under OpenAI’s terms, assuming the benchmark is properly reviewed and documented as AI-assisted?
I understand that Reddit is not legal advice, and I would still contact OpenAI or legal counsel for a definitive answer. However, I am hoping to hear from people who have already faced similar situations in practice.