Snowflake releases a flagship generative AI model of its own
All-around, highly generalizable generative AI models were once the name of the game, and they arguably still are. But increasingly, as cloud vendors large and small join the generative AI fray, we're seeing a new crop of models focused on the deepest-pocketed potential customer: the enterprise.
Case in point: Snowflake, the cloud computing company, today unveiled Arctic LLM, a generative AI model that's described as "enterprise-grade." Available under an Apache 2.0 license, Arctic LLM is optimized for "enterprise workloads," including generating database code, Snowflake says, and is free for research and commercial use.
"I think this is going to be the foundation that's going to let us, Snowflake, and our customers build enterprise-grade products and actually begin to realize the promise and value of AI," CEO Sridhar Ramaswamy said in a press briefing. "You should think of this very much as our first, but big, step in the world of generative AI, with lots more to come."
An enterprise model
My colleague Devin Coldewey recently wrote about how there's no end in sight to the onslaught of generative AI models. I recommend you read his piece, but the gist is: Models are an easy way for vendors to drum up excitement for their R&D, and they also serve as a funnel to their product ecosystems (e.g., model hosting, fine-tuning and so on).
Arctic LLM is no different. The flagship model in a family of generative AI models called Arctic, Arctic LLM (which took around three months, 1,000 GPUs and $2 million to train) arrives on the heels of Databricks' DBRX, a generative AI model also marketed as optimized for the enterprise space.
Snowflake draws a direct comparison between Arctic LLM and DBRX in its press materials, saying Arctic LLM outperforms DBRX on the two tasks of coding (Snowflake didn't specify which programming languages) and SQL generation. The company said Arctic LLM is also better at those tasks than Meta's Llama 2 70B (but not the more recent Llama 3 70B) and Mistral's Mixtral-8x7B.
Snowflake also claims that Arctic LLM achieves "leading performance" on a popular general language understanding benchmark, MMLU. I'll note, though, that while MMLU purports to evaluate generative models' ability to reason through logic problems, it includes tests that can be solved through rote memorization, so take that claim with a grain of salt.
"Arctic LLM addresses specific needs within the enterprise sector," Baris Gultekin, head of AI at Snowflake, told TechCrunch in an interview, "diverging from generic AI applications like composing poetry to focus on enterprise-oriented challenges, such as developing SQL co-pilots and high-quality chatbots."
Arctic LLM, like DBRX and Google's top-performing generative model of the moment, Gemini 1.5 Pro, is built on a mixture of experts (MoE) architecture. MoE architectures basically break down data processing tasks into subtasks and then delegate them to smaller, specialized "expert" models. So, while Arctic LLM contains 480 billion parameters, it only activates 17 billion at a time, enough to drive the 128 separate expert models. (Parameters essentially define the skill of an AI model on a problem, like analyzing and generating text.)
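For the curious, here's a minimal sketch of how top-k expert routing works in an MoE layer. The layer sizes, expert count and top-k value below are illustrative toy numbers, not Arctic LLM's actual configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy mixture-of-experts layer: a router picks the top-k experts per
    token, so only a fraction of total parameters are active per input."""

    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.top_k = top_k

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                           # (n_tokens, n_experts)
        weights, picked = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for i, expert in enumerate(self.experts):
                mask = picked[:, slot] == i               # tokens routed to expert i
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts ran per token
```

The efficiency claim follows from this design: every token pays the compute cost of only the experts it's routed to, not of the full parameter count.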
Snowflake claims that this efficient design enabled it to train Arctic LLM on open public web data sets (including RefinedWeb, C4, RedPajama and StarCoder) at "roughly one-eighth the cost of similar models."
Running everywhere
Snowflake is providing resources like coding templates and a list of training sources alongside Arctic LLM to guide users through the process of getting the model up and running and fine-tuning it for particular use cases. But, recognizing that those are likely to be costly and complex undertakings for most developers (fine-tuning or running Arctic LLM requires around eight GPUs), Snowflake's also pledging to make Arctic LLM available across a range of hosts, including Hugging Face, Microsoft Azure, Together AI's model-hosting service, and enterprise generative AI platform Lamini.
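If the Hugging Face release ships as a standard transformers checkpoint, trying the model out might look roughly like the sketch below. To be clear, the repo id and the trust_remote_code flag are my assumptions, not anything Snowflake has confirmed:

```python
# Hypothetical quick start, assuming Arctic LLM is published as a standard
# Hugging Face transformers checkpoint. The repo id below is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Snowflake/snowflake-arctic-instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    device_map="auto",        # shard across the ~8 GPUs the article mentions
    torch_dtype="auto",
    trust_remote_code=True,   # MoE checkpoints often ship custom modeling code
)

prompt = "Write a SQL query that returns the ten largest orders by total value."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```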
Here's the rub, though: Arctic LLM will be available first on Cortex, Snowflake's platform for building AI- and machine learning-powered apps and services. The company's unsurprisingly pitching it as the preferred way to run Arctic LLM with "security," "governance" and scalability.
"Our dream here is, within a year, to have an API that our customers can use so that business users can directly talk to data," Ramaswamy said. "It would've been easy for us to say, 'Oh, we'll just wait for some open source model and we'll use it.' Instead, we're making a foundational investment because we think [it's] going to unlock more value for our customers."
So I'm left wondering: Who's Arctic LLM really for besides Snowflake customers?
In a landscape full of "open" generative models that can be fine-tuned for practically any purpose, Arctic LLM doesn't stand out in any obvious way. Its architecture might bring efficiency gains over some of the other options out there. But I'm not convinced that they'll be dramatic enough to sway enterprises away from the countless other well-known and -supported, business-friendly generative models (e.g., GPT-4).
There's also a point in Arctic LLM's disfavor to consider: its relatively small context window.
In generative AI, the context window refers to the input data (e.g., text) that a model considers before generating output (e.g., more text). Models with small context windows are prone to forgetting the content of even very recent conversations, while models with larger contexts typically avoid this pitfall.
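As a rough illustration of why window size matters, here's a minimal sketch of the sliding-window truncation many chat apps apply: once a conversation exceeds the model's context budget, the oldest turns get dropped and are effectively forgotten. Real systems count tokens, not words; the 8,000-word budget here just mirrors the low end of Arctic LLM's reported range:

```python
# Minimal sketch: fitting a chat history into a fixed context budget.
CONTEXT_BUDGET_WORDS = 8_000  # illustrative; real systems count tokens

def fit_to_context(messages: list[str], budget: int = CONTEXT_BUDGET_WORDS) -> list[str]:
    """Keep the most recent messages whose combined word count fits the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        words = len(message.split())
        if used + words > budget:
            break                       # everything older is dropped
        kept.append(message)
        used += words
    return list(reversed(kept))         # restore chronological order

history = [f"turn {i}: " + "word " * 500 for i in range(40)]  # ~20,000 words total
print(len(fit_to_context(history)))  # only the most recent ~15 turns survive
```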
Arctic LLM's context window is between roughly 8,000 and 24,000 words, depending on the fine-tuning method, far below that of models like Anthropic's Claude 3 Opus and Google's Gemini 1.5 Pro.
Snowflake doesn't mention it in the marketing, but Arctic LLM almost certainly suffers from the same limitations and shortcomings as other generative AI models, namely hallucinations (i.e., confidently answering requests incorrectly). That's because Arctic LLM, like every other generative AI model in existence, is a statistical probability machine, and one that, again, has a small context window. It guesses, based on vast numbers of examples, which data makes the most "sense" to place where (e.g., the word "go" before "the market" in the sentence "I go to the market"). It'll inevitably guess wrong, and that's a "hallucination."
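To make the "statistical probability machine" point concrete, here's a toy next-word predictor. The probability table is invented purely for illustration; a real LLM learns billions of such statistical associations from training data rather than storing a lookup table:

```python
import random

# Toy next-word model: conditional probabilities keyed on the two previous
# words. The tiny table below is invented for illustration only.
NEXT_TOKEN_PROBS = {
    ("I", "go"): {"to": 1.0},
    ("go", "to"): {"the": 1.0},
    ("to", "the"): {"market": 0.6, "store": 0.3, "moon": 0.1},
}

def next_token(context: tuple[str, str]) -> str:
    """Sample the next word from the model's probability distribution."""
    dist = NEXT_TOKEN_PROBS[context]
    words, probs = zip(*dist.items())
    return random.choices(words, weights=probs)[0]

tokens = ["I", "go"]
for _ in range(3):
    tokens.append(next_token((tokens[-2], tokens[-1])))
print(" ".join(tokens))  # usually "I go to the market", but occasionally
                         # "I go to the moon": a confident, plausible-looking
                         # wrong guess, which is the essence of a hallucination
```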
As Devin writes in his piece, until the next major technical breakthrough, incremental improvements are all we have to look forward to in the generative AI domain. That won't stop vendors like Snowflake from championing them as great achievements, though, and marketing them for all they're worth.