Introducing more enterprise-grade features for API customers

To help organizations scale their use of AI without overstretching their budgets, we've added two new ways to reduce costs on consistent and asynchronous workloads:

Discounted use of approved bandwidth: Customers with a persistent Token Per Minute (TPM) usage level on GPT-4 or GPT-4 Turbo can request access to assured bandwidth to receive discounts ranging from 10–50% based on commitment size.
Reduced costs on asynchronous workloads: Customers can use our new ones Batch API for asynchronous starting of non-urgent workloads. Bulk API request pricing is 50% lower than shared pricing, offers much higher rate limits, and returns results within 24 hours. This is ideal for use cases such as model evaluation, offline classification, summarization and synthetic data generation.

We plan to continue adding new features focused on enterprise-level security, administrative controls, and cost management. For more information on these launches, visit our API documentation or get in touch with our team to discuss customized solutions for your business.

Source link

Introducing more enterprise-grade features for API customers

Leave a Reply Cancel reply

Podcasts