Top AI Models & Inference APIs on API.market
Browse our collection of high-quality AI Models & Inference APIs
Unlock cutting-edge AI with models from OpenAI, Claude, Google Gemini, and Meta Llama. Build transformative experiences with the best in AI!
GPT-5 Nano is OpenAI's fastest, cheapest version of GPT-5. It's great for summarization and classification tasks.
GPT-5.2 delivers exceptional coding and agentic task automation across industries with superior performance.
High-Speed, Low-Cost AI API with Extensive Language Model Support for Apps
GPT-4o (GPT-4 Omni) is the most advanced multimodal model (accepting text or image inputs and outputting text)
High availability and unlimited calls for GPT-3.5 Turbo. We provide users with high-quality services.
OpenAI-compatible API with 60+ models: Claude Opus/Sonnet, GPT-5, Gemini. Streaming, Vision, Tool Use. Multi-provider failover.
Unify is your centralized platform for LLM endpoints.
Access GPT-5, GPT-4.1, GPT-4o models directly with high availability and enjoy a 50% discount!
Experience low-latency chat completions with GLM-4, Qwen-Turbo, and DeepSeek at unmatched performance and cost savings.
Rapidly improve AI output accuracy by leveraging the most complete LLM Hallucination Taxonomy, Benchmark Scores, and Detection Methods.
GPT-4.1 nano excels at instruction following and tool calling. It features a 1M token context window, and low latency without a reasoning step.
GPT-4.1 excels at instruction following and tool calling, with broad knowledge across domains. It features a 1M token context window, and low latency without a
GPT-5 is OpenAI's flagship model for coding, reasoning, and agentic tasks across domains.
GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for well-defined tasks and precise prompts.
GPT-4.1 mini excels at instruction following and tool calling. It features a 1M token context window, and low latency without a reasoning step.
Generate high-quality images from text or image references quickly with impressive performance and flat pricing.
50% Discount | Direct and highly available API for all the latest Gemini models: Gemini 3 Series, Gemini 2.5 Series, and more
Generate photorealistic images from text or reference images using Google's Gemini 3 Pro Image Preview.
Google's fastest Gemini model. Ultra-low latency, multimodal, China direct connection.
Iteratively create and edit images with text, boasting impressive speed and multi-turn editing.
ByteDance's ultra-low cost, high-concurrency model. Multimodal, China direct connection.
Access the latest DeepSeek V4 Flash model. Fast, affordable, China direct connection, OpenAI compatible.
Get seamless access to all Claude models (Claude-4, Claude-3.5 and more) with our high-performance, cost-effective API.