Radar Trends to Watch: August 2024 – O’Reilly


July was a big month for model releases: There are new large models from Mistral and Meta, smaller multilingual models from Mistral and DeepL, another Mistral model that specializes in code generation, and a small version of GPT-4o. The security world saw another software supply chain disaster when CrowdStrike released a bad software update that disabled many Windows machines worldwide. While CrowdStrike’s release wasn’t “hostile,” strictly speaking, it demonstrates that there’s no real difference between a hostile attack and a bug that disables your IT infrastructure. We’re also seeing a surge in malware traffic, along with bogus vulnerability reports in CVE.

Artificial Intelligence

  • Google’s AlphaProof and AlphaGeometry solved four of six Math Olympiad problems, a performance that would have earned a silver medal in an actual competition. This is by far the best that an AI has ever achieved. However, the systems were significantly slower than human competitors.
  • Mistral has released Mistral Large 2, a 123 billion parameter model that (like other models) claims performance similar to GPT-4o. It is particularly strong at code generation. Mistral also highlights its multilingual capabilities. Large 2 is available on Hugging Face.
  • Facebook/Meta has released Llama 3.1, a 405 billion parameter model that claims performance superior to GPT-4 and Claude 3.5 Sonnet (at least on benchmarks). It is semi-open: source code and weights are available, but not training data, and there are restrictions on its use.
  • Google has developed new techniques for predicting weather that combine AI and traditional physical modeling. The new model yields more accurate long-term predictions and reduces energy consumption.
  • It’s a good day for releasing models. Mistral’s NeMo is a small open source multilingual language model. It has a large (128K) context window and performs well on English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi.
  • GPT-4o Mini, a small version of OpenAI’s flagship GPT-4o, is now available. Mini outperforms GPT-3.5 Turbo and is much less expensive per token. OpenAI also claims that Mini is resistant to jailbreaks and prompt injection. Security experts disagree.
  • DeepL’s latest large language model, which is trained to specialize in translation, outperforms Google Translate and GPT-4 for translation tasks.
  • Mistral has released Codestral Mamba, a new model for code generation that uses the new Mamba architecture rather than Transformers. Mamba is significantly faster than Transformers and scales linearly with the size of the input.
  • RTNet, a new kind of neural network, appears to make decisions the way a human would.
  • Andrej Karpathy reproduces GPT-2 (the full, 1.6B parameter model) in 24 hours for under $700.
  • A startup called Textgain has built a language model that detects hate speech in all 24 languages of the European Union.
  • Maggie Appleton makes an excellent argument about the role of AI in enabling “barefoot developers”: non-professional programmers who solve real and important problems that aren’t at the scale needed to interest the software industry.
  • Microsoft has released GraphRAG on GitHub. GraphRAG is a set of tools for retrieval-augmented generation (RAG) that uses graph technology rather than vector embeddings to store and retrieve documents.
  • With appropriate prompting, large language models are able to detect deep fake images almost as well as custom software. LLMs can also say why they believe an image is a fake.
  • Figma, the collaborative online design tool, has introduced AI for designers. The tools are for searching for ideas, exploring different directions, and automating repetitive tasks. These features are currently in beta and are free to all users until the end of the year.
  • Toys “R” Us has created a commercial that was largely generated by Sora, OpenAI’s video-generation AI.
  • Claude Projects adds to Anthropic’s capabilities. It allows you to upload documents and other data that are shared across all chats associated with the project. You can share projects with other people on your team. (Team and Pro plans only.)
  • Is this the end of the GPU? Researchers have developed a way to train language models without matrix multiplication (MatMul), thus requiring much less power. Their models also require less memory and perform similarly to models trained with MatMul.
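
The linear-scaling claim made for Codestral Mamba above can be illustrated with a toy step count. The sketch below is not Mamba and not Mistral's code: it assumes a simplified scalar recurrence (`linear_scan`, with made-up constants `a` and `b`) as a stand-in for a state space layer, and a naive all-pairs loop (`pairwise_scores`) as a stand-in for self-attention. All names are hypothetical, and the only point is how the amount of work grows with input length.

```python
def linear_scan(xs, a=0.9, b=0.1):
    """Toy recurrence h_t = a * h_(t-1) + b * x_t: one update per input item.

    The constants a and b are arbitrary illustrative values, not trained ones.
    """
    h = 0.0
    outputs = []
    steps = 0
    for x in xs:
        h = a * h + b * x  # the state carries a summary of everything seen so far
        outputs.append(h)
        steps += 1
    return outputs, steps


def pairwise_scores(xs):
    """Toy all-pairs comparison: every item is scored against every other item."""
    scores = []
    steps = 0
    for q in xs:
        row = []
        for k in xs:
            row.append(q * k)  # stand-in for a query/key score
            steps += 1
        scores.append(row)
    return scores, steps


if __name__ == "__main__":
    for n in (8, 16, 32):
        data = [1.0] * n
        _, scan_steps = linear_scan(data)
        _, pair_steps = pairwise_scores(data)
        print(n, scan_steps, pair_steps)
```

Doubling the input length doubles the step count of the scan but quadruples the step count of the all-pairs loop, which is the difference the item describes.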

Programming

  • Inrupt, a company that is commercializing software building on the open Solid protocol, has announced a data wallet for securely storing and sharing personal data.
  • The Unix Pipe Card Game should have existed a long time ago!
  • eBPF, which will soon be supported by Windows, provides a secure kernel execution facility. If it had been available, it would have prevented the CrowdStrike crashes.
  • PythonMonkey enables Python programs to run JavaScript code, and vice versa. It also gives Python the ability to execute WebAssembly (Wasm) modules.
  • 1JPM (1 Java Project Manager) presents a different approach to build management. It’s a single file of Java source code, which you edit to reflect your project’s dependencies and other customizations. It’s an interesting alternative to the widely used and hated Maven.
  • An academic paper discusses design patterns for low-latency applications in C++. While it focuses on high-frequency trading, the ideas in this paper are no doubt useful for many kinds of applications.
  • The Principles Wiki is a great source of information and discussion about software design principles. It appears to be new; help it grow!
  • Julia Evans (@b0rk) gives some good reminders of why shell job control is useful, not the least of which is terminating a program that doesn’t respond to CTRL-C.
  • Marimo is a Python notebook that runs entirely in the browser using Wasm and Pyodide. Notebook elements, including user interface elements, run automatically whenever you modify or interact with them.

Security

  • The principle of least privilege in access control is crucial—but in practice, it is rarely implemented well. Can AI do a better job of determining who should access what and when?
  • A bad upgrade from CrowdStrike caused many Windows systems to crash, causing serious service interruptions for airlines, hospitals, and other organizations. Supply chain security isn’t just about open source; commercial vendors are a problem too.
  • Cloudflare’s 2024 update to its application security report states that it’s seeing a substantial uptick in malicious traffic, which is now roughly 7% of all traffic. Bot traffic is a major contributor.
  • An analysis of a software supply chain attack shows how malicious code is hidden in apparently normal images. The engineering in these attacks is increasingly sophisticated.
  • Blast-RADIUS is a new man-in-the-middle attack against the widely used RADIUS protocol for authentication, authorization, and accounting. Among other things, RADIUS is used for authentication by VPNs, ISPs, and Wi-Fi.
  • Ente Auth is an open source authenticator that provides 2FA, encrypted cloud backups, and cross-platform synchronization. Its cryptography has been externally audited.
  • A newly discovered vulnerability in OpenSSH allows unauthenticated remote code execution. If you aren’t keeping up to date on patches, it’s time to start.
  • The CVE system, which reports and archives security vulnerabilities, has increasingly been used for bogus vulnerability reports. Some of these are good-faith errors, but an increasing number come from bounty hunters and others trying to enrich their résumés.
  • Hijackable hyperlinks are a problem. These links have misspelled URLs, placeholder URLs for sites that don’t exist yet, and more. These errors frequently aren’t fixed before the site goes live. Anyone discovering these links can register their domain name and build a hostile site.
  • SnailLoad is a surprising attack against online privacy. After a user downloads the malware (which does nothing overtly hostile), SnailLoad monitors internet latency. Small variations in latency are used as signatures for detecting what media the user is viewing.
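
The hijackable-hyperlink problem described above lends itself to a simple pre-launch check. The following is a minimal sketch, not a tool from the cited research: it assumes the site owner keeps an allowlist of domains they trust (`KNOWN_DOMAINS`, a hypothetical name with placeholder values) and flags every absolute link whose host falls outside it, so that typos and placeholder URLs get a human look before the site goes live. It works on an HTML string and makes no network requests.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist: domains the site owner controls or trusts.
KNOWN_DOMAINS = {"example.com", "oreilly.com"}


class LinkCollector(HTMLParser):
    """Collects the hostname of every absolute <a href="..."> link."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                host = urlparse(value).hostname  # None for relative links
                if host:
                    self.hosts.append(host.lower())


def suspicious_hosts(html, known=KNOWN_DOMAINS):
    """Return hosts that are neither an allowlisted domain nor a subdomain of one."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for host in collector.hosts:
        trusted = any(host == d or host.endswith("." + d) for d in known)
        if not trusted and host not in flagged:
            flagged.append(host)
    return flagged


if __name__ == "__main__":
    page = (
        '<a href="https://www.example.com/docs">docs</a>'
        '<a href="https://exmaple.com/login">login</a>'
        '<a href="/about">about</a>'
    )
    print(suspicious_hosts(page))
```

A flagged host is only a prompt for review: a real check would also need to confirm whether the domain is registered and who owns it, which this sketch deliberately leaves out.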

Web

  • Google is abandoning its plan to eliminate third-party cookie support in Chrome. Instead, there will be user-settable controls for cookie use. While privacy advocates object to abandoning the plan to eliminate cookies, it’s only fair to report that privacy advocates have also objected to Google’s proposed alternatives.
  • The Hall of Shame has a catalog of dark patterns that web designers use to deceive or manipulate users. Whether you’re a web developer or a user, it’s a good idea to familiarize yourself with the kinds of abuses that are out there.
  • WebVM is a virtual Linux environment running in the browser. It’s based on an x86 emulation layer written in WebAssembly.
  • Transfer Thought is an open source platform for developing WebXR (VR, AR, any other kind of R) experiences.
  • The Ladybird Browser project is getting a lot of attention. It’s an attempt to build a standards-compliant web browser completely from scratch, without relying on code from Google or other vendors. An alpha version isn’t expected until 2026.
  • MoonBit is the second new language designed specifically to target WebAssembly. It is inspired by Rust, but designed to be a good match for Wasm’s semantics.

Quantum Computing

  • PsiQuantum, a quantum computing startup, is planning to build a million-qubit quantum computer within 10 years. Unlike other quantum teams, which have focused on building small systems, PsiQuantum is jumping directly to a computer that is capable of useful work.
  • It’s not a personal quantum computer, but the Quokka is a personal quantum computer emulator with 30 fault-tolerant qubits. It’s a platform for learning how to program useful quantum computers before we get the real thing.

Robotics

  • A robotic dog with vacuum cleaners in its feet can be used to clean beaches.
  • Training humanoid robots to dance may make them better at working with humans. They become better able to learn new movements and gestures.
  • Researchers are working on robots that learn by listening. Although audio provides important clues for many tasks that robots are asked to perform, it is rarely used as a source of training data.

Hardware

  • Tenstorrent has developed a new set of AI chips that are much less expensive than NVIDIA’s. They are available as PCIe cards or as components of complete workstations.

