Zürich, Zürich, Switzerland
Contact information
10,525 followers
89 connections
Activity
-
One of the biggest challenges every AI company will face: Trust. For decades, we've gotten used to determinism in software. Since LLMs are…
Liked by Gary Illyes
-
The internet is built on standards -- some that were created 30 years ago, some that are just a few years old --, that allow interoperability of the…
Shared by Gary Illyes
-
The IAB is calling for papers about controlling AI crawlers for a potential invite to a two day IAB workshop on the same topic in DC. If that's…
Shared by Gary Illyes
Experience and education
-
Google
*******
More activity by Gary Illyes
-
Lava is probably not among the first things you associate Switzerland with, yet we have torrential lava in the Canton of Valais. Allegedly. "Lave…
Shared by Gary Illyes
-
Last post this week about robots.txt for its 30th birthday, let's have some fun. Fact from a few days ago: you can have invalid lines in the…
Published by Gary Illyes
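The robots.txt trivia above (invalid lines are tolerated) can be checked against Python's standard-library parser, a different implementation from Google's, but one that shows the same tolerance: lines it cannot interpret are simply skipped.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt containing a line no parser recognizes.
# Conforming parsers skip lines they cannot interpret instead of failing.
lines = [
    "User-agent: *",
    "this is not a valid robots.txt line",  # ignored, does not break parsing
    "Disallow: /private/",
]

rp = RobotFileParser()
rp.parse(lines)

print(rp.can_fetch("*", "https://example.com/private/page"))  # False
print(rp.can_fetch("*", "https://example.com/public/page"))   # True
```

The `Disallow` rule still takes effect even with the garbage line in between, which is exactly the robustness the post is celebrating.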
More similar profiles
-
John Mueller
Zürich -
Danny Sullivan
Newport Beach, CA -
Martin Splitt
Zürich, Switzerland -
Barry Schwartz
Suffern, NY -
Daniel Waisberg
Zürich -
Lily Ray
New York, NY -
Matt Cutts
Washington, DC -
Aleyda Solís
Spain -
Neil Patel
Los Angeles, CA -
Eli Schwartz
San Francisco Bay Area -
Marie Haynes
Ottawa, ON -
Glenn Gabe
Pennington, NJ -
Louis Smith
Greater Manchester, United Kingdom -
Cherry Sireetorn Prommawin
Search Quality Analyst / Webmaster Relationships at Google Asia Pacific
Singapore -
Daniel Foley Carter
London -
Rand Fishkin
Seattle, WA -
Liam Fallen
Greater London, United Kingdom -
Mark Williams-Cook
Greater Norwich, United Kingdom -
Brian Dean
Boston, MA -
Fery Kaszoni
Oxford area
Explore more posts
-
Data Science Dojo
Open Source LLMs are like a community cookbook. Their code and underlying mechanisms are freely available for anyone to see, modify, and experiment with. Closed Source LLMs, on the other hand, are like exclusive restaurant recipes. Their inner workings are proprietary and not readily accessible to the public. Become an LLM master! Enroll in our exclusive 5-day LLM Bootcamp (online & in-person) ➡️ https://hubs.la/Q02GspfH0 #LLMs #LargeLanguageModel #DataScience
16
-
Prof Bill Buchanan OBE FRSE
In many places I see ChaCha20 replacing AES for symmetric key encryption. But is the 64-bit nonce value too small? Well, there's always XChaCha20, which has a 192-bit nonce value. With symmetric key encryption, Bob and Alice have the same key. NSec.cryptography uses the XChaCha20 method, which supports stream encryption (it does not require padding as a block cipher does, and is also faster than block cipher modes). ChaCha20 was created by Daniel J. Bernstein, and has an eight byte nonce (the IETF variant uses a 12 byte nonce). XChaCha20 (eXtended-nonce ChaCha) is an update to ChaCha20, and uses a 24 byte nonce. It has a lower probability of nonce misuse than ChaCha20. The ciphertext is the same length as the plaintext message (five bytes in this example), plus an extra 16 bytes used for AEAD (Authenticated Encryption with Associated Data). The MAC bytes use Poly1305 and provide an integrity check for the cipher. Try here: https://lnkd.in/eDhK7w2n
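The "is a 64-bit nonce too small?" question comes down to the birthday bound for randomly chosen nonces. A small stdlib sketch makes the difference concrete; the formula is the standard approximation and the message counts are illustrative:

```python
import math

def nonce_collision_probability(n_messages: int, nonce_bits: int) -> float:
    """Birthday-bound estimate P ≈ 1 - exp(-n^2 / (2 * 2^bits)) that at
    least two of n uniformly random nonces collide."""
    x = (n_messages ** 2) / (2.0 * 2.0 ** nonce_bits)
    return -math.expm1(-x)  # = 1 - exp(-x), stays accurate for tiny x

# ChaCha20's 64-bit nonce: after 2^32 messages under one key, the odds of a
# repeated nonce are already about 39%.
print(nonce_collision_probability(2**32, 64))
# XChaCha20's 192-bit nonce: the same traffic gives a negligible risk,
# which is why random nonces are considered safe there.
print(nonce_collision_probability(2**32, 192))
```

Note this applies to nonces chosen at random; a counter-based nonce never collides but requires keeping state, which is exactly the trade-off the extended nonce avoids.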
40
8 comments -
AI Nexus
Apple recently unveiled an Open Source Efficient LLM (OpenELM), a groundbreaking project aimed at making LLMs more accessible. Led by Sachin Mehta, Mohammad Hossein Sekhavat, and Qingqing Cao, OpenELM offers a family of smaller LLMs. What sets OpenELM apart is its focus on efficiency and accessibility. The model is designed to be lightweight and efficient enough to run on a smartphone, and with parameter counts ranging from 270 million to 3 billion, OpenELM models can process 2,048 tokens of context, providing users with powerful language capabilities on their mobile devices. The authors pretrained OpenELM on 1.8 trillion tokens, drawing from subsets of publicly available text datasets. Results have shown that OpenELM outperforms other open-source models trained solely on publicly available data. For example, a 1.08 billion parameter OpenELM surpassed a similar model (the 1.18 billion parameter OLMo), achieving superior performance across 5 tasks. Apple's commitment to user privacy shows in the design: by allowing the models to run entirely on the device, Apple ensures that user activity remains secure and private. In a time when language technology is rapidly advancing, initiatives like OpenELM pave the way for more accessible and efficient language models. Follow AI Nexus - Club for more… #ai #apple
3
-
SMX Advanced Europe
Kabeer Singh is a computer science engineer who turned into an #SEO. And he does know about large-scale marketplaces as he worked with brands like Rocket Internet, Expedia, Qiagen and (eBay) kleinanzeigen. In this session, he will take a deep dive into how we can scale up / down relevant pages (in the order of 10M+ / 100M+) based on automations and thresholds. We’ll look at maximizing returns from faceted navigation, internal search as a goldmine, demand and supply balance and dynamic (internal) link architecture (involving search, category, product pages). https://ow.ly/H5kx50RTRSG #smxadvanced #WebOptimization
3
-
Josh Rickel
Not surprising but more signs of on device LLMs coming from Apple. The critical component to success will be how this is implemented across native applications, how third party applications can leverage them on device, how much Siri will be involved (probably heavily) and how it will maintain recency to keep the experience as current as possible. We could see new models deployed like software updates on our devices and multiple smaller, fine tuned models for specific purposes on device. #Apple #AI
9
-
Ali Arsanjani, PhD
These findings motivate our design of a novel framework to investigate hidden representations in LLMs with LLMs, which we call Patchscopes. The key idea behind this framework is to use LLMs to provide natural language explanations of their own internal hidden representations. Patchscopes unifies and extends a broad range of existing interpretability techniques, and it enables answering questions that were difficult or impossible before. For example, it offers insights into how an LLM's hidden representations capture nuances of meaning in the model's input, making it easier to fix certain types of reasoning errors. While we initially focus the application of Patchscopes to the natural language domain and the autoregressive Transformer model family, its potential applications are broader. For example, we are excited about its applications to detection and correction of model hallucinations, the exploration of multimodal (image and text) representations, and the investigation of how models build their predictions in more complex scenarios. Google DeepMind
8
-
Bolsters Advisory
San Francisco-based startup NucleusAI has recently unveiled its 22-billion-parameter large language model (LLM) after emerging from stealth. 💡 The Power of Collaboration & Expertise This open-source LLM offers versatility between 13B and 34B segments and boasts a unique training approach using a trillion tokens of diverse data. The model encompasses a wide knowledge base, spanning general information to research and coding insights. 🌾 Agriculture Revolutionized? NucleusAI's vision extends beyond AI as they pioneer an intelligent operating system for agriculture. It aims to optimize supply and demand, akin to what Uber did for taxis. NucleusAI's venture into agriculture underscores the potential and impact of AI across different industries. #VentureCapital #AI #AIInnovation #AgricultureTech #StealthMode #EmergingTech #NucleusAI #IntelligentFarming #TechInvesting https://lnkd.in/dMtg27qZ
2
-
Keep-Current
A new LLM - but of a different kind - was released today in Austria: xLSTM. Most LLMs use the transformer architecture. The LSTM method ruled the NLP world up to ~2018 and was then neglected for a while after BERT appeared. LSTM had its issues - it was difficult to handle long sequences (Transformers create a matrix of relations between every word and the other words in the sequence). Hence, the question was: can LSTM still compete with the advances of Transformers and LLMs? Turns out it can. It's too early to conclude, but the number of parameters and the amount of training data are potentially more important than which architecture is used. In any case, this could open up future possibilities for new LLMs of different sizes. https://lnkd.in/gbfpXBYe
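For readers who have only ever seen Transformers, here is what a textbook LSTM cell computes at each step. This is the classic formulation, not xLSTM itself (which modifies the gating, among other things), and the scalar weights are illustrative, not trained values:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell(x, h_prev, c_prev, W):
    """One step of a scalar textbook LSTM cell (hidden size 1 for clarity).
    W maps each gate name to (w_x, w_h, b): input (i), forget (f),
    output (o) gates and the candidate update (g)."""
    i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])
    f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])
    o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])
    g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2])
    c = f * c_prev + i * g   # cell state: forget some memory, write some new
    h = o * math.tanh(c)     # hidden state: gated read-out of the memory
    return h, c

# Run a short sequence through the cell with illustrative weights.
W = {k: (0.5, 0.5, 0.0) for k in ("i", "f", "o", "g")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_cell(x, h, c, W)
print(h, c)
```

The sequential dependence of `c` on `c_prev` is both the appeal (constant memory per step) and the historical weakness (no parallelism across the sequence) that the post alludes to.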
-
Soum Paul
There are three fundamental problems researchers face when building a foundational LLM in any language. They are... 👇 1. GPU cost: Training an LLM from scratch is costly. Very costly. 2. Absent revenue model: Unless an Indic-language LLM's performance beats top existing models, it is tough to monetize -- and therefore, impossible to recover GPU costs. 3. Data: LLM training requires clean, accurate, high-quality data. This is currently absent for most Indian languages. It is possible to create it, but that requires substantial work. Even if one solves the first two problems, the third one is a major challenge. Due to this, even fine-tuned versions of existing LLMs don't perform as well as the original model. Unless one is confident about cracking all three problems, it is not worth getting into. There are numerous (100s or 1000s of) other problem statements in the AI domain that one can take on instead.
19
1 comment -
Mark Hinkle
Last November the news was on fire with speculation that an OpenAI model called Q* was getting close to AGI. Today Adam Selipsky announced their new product Amazon Q. Coincidence? I think not. This used to be a product called CodeWhisperer, but it looks like it's merged into a product suite under the name Q. The website for CodeWhisperer had an announcement listed: "Amazon CodeWhisperer is now Amazon Q Developer, which is generally available. All of the functionality that CodeWhisperer provides is part of Amazon Q Developer – in-line code suggestions, security scanning, open-source license tracking, and more. This website will remain live until mid May. For the latest information and to learn more about Amazon Q Developer, visit the website." Anyhow, it looks like the play is to connect internal data sources and employees to make generative AI apps based on their own apps. They note in their FAQs: "When you sign up for Amazon Q Developer Pro or Amazon Q Business plans, we don't use content to improve underlying models for others. Your data is your differentiator." But it doesn't look like there are separate instances of the model but rather a super-model (yeah, I made that pun). So if you trust that your data won't get integrated, then you should be fine. Just like the trainwreck you couldn't look away from on the news. 🙄 P.S. 👍 If you found these insights useful, please like this post to support more content like this! 🔁 Feel free to share with your network. P.P.S. Subscribe to my newsletter, The Artificially Intelligent Enterprise, for weekly insights into how to succeed with AI in Business - https://lnkd.in/eqJNArYx
211
17 comments -
Legal Tech Blog
PwC Germany and Aleph Alpha Found Joint Venture creance.ai The auditing and consulting firm PwC Germany and Aleph Alpha, a leading generative AI developer in Europe, have announced the establishment of a joint venture. The aim is to jointly develop innovative solutions using generative artificial intelligence in the legal and compliance market. The joint venture will operate under the name creance.ai and its products will support companies in dealing with complex legal requirements. Read more here: https://lnkd.in/enpMA2_T #LegalTech #AI #JointVenture PwC Deutschland Aleph Alpha Björn Viebrock Jonas Andrulis
1
-
Miku Jha
🌟 Iterative Self-Refinement: A Human-Inspired Approach to Enhancing AI Outputs Ever felt stuck rewriting an email to get it just right? 📝 Large language models (LLMs) experience similar challenges, struggling to perfect outputs on the first try. This hit home while laboriously tweaking LLM outputs: repetitively generating, questioning ("Is this tone executive-appropriate?"), and refining based on feedback. It was draining for both me and the LLM. Enter "Self-Refine," a concept that has me utterly captivated. Why? Because it automates the very process I was wrestling with! Imagine an LLM that can not only generate text, but also analyze it critically and refine it based on its own assessment, all without a single human intervention. This is the magic of Self-Refine: a method that empowers LLMs to become their own editors, constantly striving to improve their work through a human-like iterative process. 🔄 🛠️ Delving into the Technical Wizardry of Self-Refine The elegance of Self-Refine lies in its simplicity and effectiveness. Here's a breakdown of the core functionalities that make this human-like iterative process a reality: 🔍 Initial Generation: The model receives an input and generates an initial response. This could be anything from crafting a creative marketing blurb to writing a line of code. 🎯 Self-Critique: The model then analyzes its own output, identifying areas for improvement in terms of quality, accuracy, or adherence to specific criteria. Imagine the LLM dissecting its email draft and questioning the clarity of certain sentences or the overall tone. ✍️ Self-Refinement: Leveraging the self-generated critique, the model refines the initial output to enhance its performance. This could involve rewriting sentences, adjusting the technical approach in the code, or tailoring the marketing message for a specific audience.
🔄 Iterative Process: This cycle continues until the model determines no further improvement is possible or a predefined number of iterations is reached. Just like revising an email draft until you're satisfied, Self-Refine allows the LLM to iteratively refine its work until it achieves the desired outcome. 💡 Benefits and Advantages of this Self-Learning Approach 🚀 Reduced Reliance on Training Data 📈 Demonstrated Performance Gains 🌐 Broad Applicability As Agentic AI workflows continue to evolve, I believe Self-Refine has the potential to become a cornerstone technology, driving significant advancements in LLM output quality and ultimately contributing to the success of Agentic AI applications. For a deeper dive: https://lnkd.in/g_X3JCGA #AI #ArtificialIntelligence #GenerativeAI #vertexai
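The generate, critique, refine cycle described above can be sketched generically. The three callables stand in for LLM calls, and the toy stand-ins at the bottom are purely illustrative:

```python
def self_refine(task, generate, critique, refine, max_iters=4):
    """Generic Self-Refine loop: draft, self-critique, revise, repeat.
    `critique` returns None when the model judges the output good enough,
    which is the stopping condition; max_iters bounds the loop otherwise."""
    output = generate(task)
    history = [output]
    for _ in range(max_iters):
        feedback = critique(task, output)
        if feedback is None:  # no issues found: stop refining
            break
        output = refine(task, output, feedback)
        history.append(output)
    return output, history

# Toy stand-ins: "refining" a sentence until it is polite enough.
generate = lambda task: "send me the report"
critique = lambda task, out: None if out.startswith("Please") else "too blunt"
refine = lambda task, out, fb: "Please " + out

final, drafts = self_refine("write an email", generate, critique, refine)
print(final)        # "Please send me the report"
print(len(drafts))  # 2 (initial draft + one refinement)
```

The same loop shape works whether the three roles are played by one model prompted three ways (as in the Self-Refine paper) or by separate models.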
8
-
Tyler Neylon
What will LLMs look like in 10 years? Prediction: I think individual LLM modules will use similar transformer architecture with attention that's better at long context — and that they'll be _smaller_ than the largest LLMs we use today. I'll break that down. Prediction 1: I snuck in the use of the term "individual LLM modules" because we may continue using LLMs that are experts or specialists. Right now those are tightly coupled together, but we may find it useful to decouple them for speed (parallelism), memory use (lower GPU mem needed per module), and training (can optimize one module at a time). Prediction 2: Better at longer context. We're already seeing those changes as models tend to have longer context windows. So many new ideas have gone into this, and I'm paying attention because this is where the next capability shift will occur — models will effectively have unlimited context and, effectively, remember things from the arbitrary past. Prediction 3: Models will stay "small" as opposed to the best models getting ever larger, maybe based on a variant of Moore's law. A few ideas go into that last "staying small" prediction. One of them is a new paper looking into the effects of pruning _entire layers_ from LLMs. This sounds to me like chopping off chunks of a brain and seeing what happens. It's a grisly, horrifying action to take on an intricate, finely-tuned engine. But — egads! — it actually works. To be clear, by "it works," I mean that the authors found they can cut off up to ~30% of the _layers_ of an LLM, do a small amount of fine-tuning, and the model still performs at essentially the same level of quality as it did before the lobotomy — er, pruning. I can't think of a clearer sign that we can build great LLMs with fewer weights. Of course, some layers are more important than others. This week's Learn & Burn talks about which layers are most important, and how they figured that out: https://lnkd.in/g6jdhERT
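The layer-pruning idea can be illustrated with a toy stack of residual-style layers. This sketches only the mechanics of dropping a contiguous block; it is not the paper's method for choosing which block to drop (there, the block whose input and output representations are most similar), nor the small fine-tuning step that follows:

```python
def prune_layers(layers, start, count):
    """Drop `count` consecutive layers starting at `start` from a layer stack.
    Which block to drop is left to the caller; depth-pruning work picks the
    block that changes the representation least."""
    assert 0 <= start and start + count <= len(layers)
    return layers[:start] + layers[start + count:]

# Toy "model": each layer is a residual update x -> x + f(x), so the model
# still runs end to end after any block of layers is removed.
layers = [lambda x, k=k: x + 0.1 * k for k in range(10)]

def forward(layers, x):
    for layer in layers:
        x = layer(x)
    return x

pruned = prune_layers(layers, start=5, count=3)  # remove ~30% of the depth
print(len(pruned))          # 7
print(forward(layers, 0.0))   # 4.5
print(forward(pruned, 0.0))   # 2.7
```

The residual structure is what makes this surgery survivable in real Transformers too: every layer's output lives in the same representation space as its input, so the remaining layers still receive inputs they can work with.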
12
2 comments -
Alex Vesa
Ten years ago, having a Feature Store in my pocket was unimaginable 🤔. Everything was stored in CSV files, scattered across folders, and crammed into a single database. Back then, the concept of MLOps was unfamiliar to me, so everything was done manually. 🤣 Do you remember when you wanted to train a new model, you'd be waiting for the latest CSV file from the data science team? Or if you wanted to run inference, you'd be on hold until the ML team provided the updated .pt file? But those days are gone... Today, Feature Stores are a cornerstone of MLOps, streamlining data management and model deployment. They provide a centralized repository for features, ensuring consistency, reusability, and faster model iteration. A feature store plays a critical role in machine learning workflows by: Centralizing Data Management: It centralizes feature data, making it accessible and reusable across multiple machine learning models and projects. Ensuring Consistency: Consistent feature calculation ensures that the same data preprocessing steps are applied in both training and prediction phases, reducing errors. Improving Efficiency: By storing pre-computed features, it significantly speeds up the experimentation process, allowing for rapid testing of different models. Scaling with Ease: As projects grow, a feature store can manage the scaling of data operations efficiently, supporting larger datasets and more complex feature engineering tasks. If you want to learn more about feature stores: You can read more here ↓↓↓ Feature Store: https://lnkd.in/d9DV7UaJ 🔗 The Role of Feature Stores in Fine-Tuning LLMs: https://lnkd.in/d6SfMtpc
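The consistency point can be made concrete with a minimal in-memory sketch. Real feature stores (Feast, Tecton, and the like) add versioning, point-in-time joins, and online/offline syncing; the class and method names here are invented for illustration:

```python
from datetime import datetime, timezone

class FeatureStore:
    """Minimal in-memory feature store sketch (names are illustrative).
    The key property: training and serving read the *same* stored values,
    so preprocessing cannot silently diverge between the two paths."""

    def __init__(self):
        # (entity_id, feature_name) -> (value, write timestamp)
        self._features = {}

    def put(self, entity_id, name, value):
        self._features[(entity_id, name)] = (value, datetime.now(timezone.utc))

    def get(self, entity_id, names):
        """Fetch a feature vector for one entity (the serving path)."""
        return [self._features[(entity_id, n)][0] for n in names]

    def training_frame(self, entity_ids, names):
        """Assemble rows for offline training: same values, same path."""
        return [self.get(e, names) for e in entity_ids]

store = FeatureStore()
store.put("user_42", "avg_session_min", 12.5)
store.put("user_42", "purchases_30d", 3)
print(store.get("user_42", ["avg_session_min", "purchases_30d"]))  # [12.5, 3]
```

Because `training_frame` is built on the same `get` used at serving time, the "train on one preprocessing pipeline, serve with another" class of bug disappears by construction.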
21
5 comments -
Andrei Lopatenko 🇺🇦
PMC-LLaMA: Towards Building Open-source Language Models for Medicine A year-old paper, but a lot of interesting, practically significant details about how fine-tuning can *significantly* improve models for a specific domain. Many companies fine-tune models for the domains representing their business, and here are interesting insights on how to achieve greater gains: "Compared with baseline 7B LLaMA model, integrating biomedical papers brings a performance gain from 44.54% to 44.70% and 48.51% to 50.54% on MedQA and MedMCQA respectively. While after adding books for training, the performance is improved significantly, i.e., obtaining 1.02%, 2.94%, and 1.2% on MedQA, MedMCQA and PubMedQA respectively. Both observations have shown the importance of injecting fundamental medical knowledge." "Furthermore, integrating conversations with rationale QA for instruction tuning can produce substantial enhancements, with performance boosts from 49.32% to 54.43% on MedQA. This demonstrates the pivotal role played by the diversity of question types during the instruction tuning stage, as all involved questions will be limited on medical choice tests without conversation. In addition, the incorporation of a knowledge graph introduces a further improvement of 1.93% on the MedQA dataset, demonstrating the importance of using explicit instructions to emphasize the key medical concepts." and other details are inside https://lnkd.in/gCvTkJf7
6
-
Uzma Firoz Khan
As euphoric and curious minds join as interns this summer, a gush of nostalgia hits me, reminding me of when I joined Microsoft around the same time last year. Here's my piece of advice 1. It would likely be the first #software job for most #interns, and feeling unsure and anxious is commonplace. Acknowledge your feelings and direct them towards learning and progress. 2. #Microsoft believes in a #learn_it_all culture rather than a know-it-all one, which is valid for most other companies. Being an intern, you'll be judged by how quickly you learn rather than by how much you already know. 3. That brings us to our next point. It's needless trying to prove that you belong. You have what it takes; you don't have to prove anything to anyone. Give your best and believe in yourself. 4. You cannot prepare beforehand. Take that pressure off your shoulders. If you don't know something, figure out how to do it using #stackoverflow, #documentation, #Google Search, and other tools. There will always be new tech stacks to learn (or even in-house tech). Use online resources and documentation. 5. Ask questions about and outside the project/technology you're working on. Don't stay "stuck" for too long. Ask relevant questions and get help after you have done your research. 6. #Communication is the key. There are two aspects to it. Firstly, communicate well with your manager and mentor about what's expected of you by the end of the internship. It'll help you track the progress and ensure you are going in the right direction. Set up regular meetings with your manager and others you'll work closely with. 7. Secondly, internships are an excellent opportunity to #network and indulge in tech discussions with brilliant minds and experienced people. Don't limit your #internship to your desk. Interact with people not just from your team but also other teams. People are intriguing human novels with rich stories and experiences. 
Interacting with them broadens your thinking spectrum by giving you varied perspectives. 8. Regular #feedback from your manager is crucial. Managers are present to help you grow to your full potential. Keep taking input on what's going well and what's not. 9. Most importantly, keep a daily journal of the work you do. It will make it easy for you to track and update others on your progress. Jot down what you learned, the features you worked on, the discussions you had, and the code you wrote throughout your day. Talk about challenges you faced/are facing and how you plan to overcome them. 10. Network with your intern group and support them whenever possible. Seeing that you're not in it alone is relieving and motivating. Additionally, you'll find a group of people to sing and dance with at intern events ;) Finally, whether or not you get a #PPO, being an intern will be one of your best experiences. Enjoy, interact, #learn, and #grow as much as possible. Remember to make memories. You've got this!!
172
4 comments -
Kalyan KS
Ganga-1B: Open-Source Hindi LLM Ganga-1B is an open-source LLM for the Hindi language. Ganga-1B is pretrained from scratch, unlike other Indic LLMs which are adapted from existing LLMs like Llama, Gemma etc. Ganga-1B outperforms all open-source LLMs with sizes up to 7B. Excellent work done by Lingo Labs, CSE Department, Indian Institute of Technology Gandhinagar. Note: As mentioned in the tweet, the model weights and the paper will be available online soon. #hindi #llms #generativeai #opensource #india #nlproc #deeplearning
276
13 comments -
Jeremiah Harmsen
Want your models to better represent global perspectives? 🌍 Check out this wonderfully insightful and *actionable* research from the team in Google DeepMind Zurich (led by the fantastic student researcher Angéline Nathalie Pouget). 👇 Check it out "No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision–Language Models" https://lnkd.in/gnGc4DG3 💡Key findings: - "common filtering of training data to English image–text pairs disadvantages communities of lower socioeconomic status and negatively impacts cultural understanding" - "pretraining with global, unfiltered data before fine-tuning on English content can improve cultural understanding without sacrificing performance on ... popular benchmarks" - introduces "the task of geo-localization as a novel evaluation metric to assess cultural diversity in VLMs" 🧠 Authors Angéline Nathalie Pouget Lucas Beyer Emanuele Bugliarello Xiao Wang Andreas Steiner Xiaohua Zhai Ibrahim Alabdulmohsin
24