Last Update 6:48 AM February 17, 2025 (UTC)

Identity Blog Catcher

Brought to you by Identity Woman and Infominer.
Support this collaboration on Patreon!!!

Monday, 17. February 2025

Simon Willison

50 Years of Travel Tips

50 Years of Travel Tips These travel tips from Kevin Kelly are the best kind of advice because they're almost all both surprising but obviously good ideas. The first one instantly appeals to my love for Niche Museums, and helped me realize that traveling with someone who is passionate about something fits the same bill - the joy is in experiencing someone else's passion, no matter what the topi

50 Years of Travel Tips

These travel tips from Kevin Kelly are the best kind of advice because they're almost all both surprising but obviously good ideas.

The first one instantly appeals to my love for Niche Museums, and helped me realize that traveling with someone who is passionate about something fits the same bill - the joy is in experiencing someone else's passion, no matter what the topic:

Organize your travel around passions instead of destinations. An itinerary based on obscure cheeses, or naval history, or dinosaur digs, or jazz joints will lead to far more adventures, and memorable times than a grand tour of famous places. It doesn’t even have to be your passions; it could be a friend’s, family member’s, or even one you’ve read about. The point is to get away from the expected into the unexpected.

I love this idea:

If you hire a driver, or use a taxi, offer to pay the driver to take you to visit their mother. They will ordinarily jump at the chance. They fulfill their filial duty and you will get easy entry into a local’s home, and a very high chance to taste some home cooking. Mother, driver, and you leave happy. This trick rarely fails.

And those are just the first two!

Via Hacker News

Tags: travel, museums


Damien Bod

Customizing a single client sign-in using parameters in Duende IdentityServer

This post looks at customizing the sign-in UI and the sign-in options in an ASP.NET Core application using Duende IdentityServer and ASP.NET Core Identity. There are multiple ways of changing the look and feel of the UI for different OpenID Connect clients or different client flows. In the previous post, the UI was customized per […]

This post looks at customizing the sign-in UI and the sign-in options in an ASP.NET Core application using Duende IdentityServer and ASP.NET Core Identity. There are multiple ways of changing the look and feel of the UI for different OpenID Connect clients or different client flows. In the previous post, the UI was customized per client, this post customizes inside a single client.

Code: https://github.com/damienbod/duende-multi-tenant

Blogs in the series:

Multiple client sign-in customizations using Duende identity provider
Customizing a single client sign-in using parameters in Duende IdentityServer

Setup

The solution is set up using three different ASP.NET Core applications. In the example code, the “admin” application has different federation authentication options compared to the “shop” client sign-in experience. The client ID from the authentication context is used to customize the look and feel: the styles, the layout and the available options of the client define which federation and authentication options are possible. The shop client can be further customized using authentication parameters sent in the OpenID Connect redirect.

OIDC client implementation

In ASP.NET Core, the OpenID Connect flow implementation provides multiple events which can be changed or extended. The OnRedirectToIdentityProvider event can be used to send custom parameters to the OpenID Connect server. OAuth Pushed Authorization Requests (PAR) on top of OpenID Connect Core are used by default in .NET 9, if the OpenID Connect server supports this.

builder.Services.AddAuthentication(options =>
{
    options.DefaultScheme = CookieAuthenticationDefaults.AuthenticationScheme;
    options.DefaultChallengeScheme = OpenIdConnectDefaults.AuthenticationScheme;
})
.AddCookie()
.AddOpenIdConnect(options =>
{
    options.SignInScheme = CookieAuthenticationDefaults.AuthenticationScheme;
    // other options ...
    options.Events = new OpenIdConnectEvents
    {
        OnRedirectToIdentityProvider = context =>
        {
            context.ProtocolMessage.SetParameter("showadminsignin", "false");
            return Task.FromResult(0);
        }
    };
});

Duende sign-in customization

In the previous post, a new sign-in UI was created for each client. The shop client has further customization. In this demo, the admin external provider can be hidden or displayed depending on what the client requests. The UI is implemented using ASP.NET Core Razor pages and a BindProperty is used for this.

[BindProperty] public bool ShowAdminSignIn { get; set; } = true;

When processing the authentication with Duende and ASP.NET Core Identity, the GetAuthorizationContextAsync method can be used to get the parameters requested by the client.

var context = await _interaction.GetAuthorizationContextAsync(returnUrl);
ShowAdminSignIn = !(context?.Parameters["showadminsignin"] == "false");

The UI can implement any display logic using the client parameters sent in the request. In this demo, the admin UI is hidden or displayed using the client request data.

@foreach (var provider in Model.View.VisibleExternalProviders)
{
    if (!(!Model.ShowAdminSignIn && (provider.AuthenticationScheme == "AdminEntraID")))
    {
        <li class="list-inline-item">
            <a class="btn btn-secondary"
               asp-page="/ExternalLogin/Challenge"
               asp-route-scheme="@provider.AuthenticationScheme"
               asp-route-returnUrl="@Model.Input.ReturnUrl">
                @provider.DisplayName
            </a>
        </li>
    }
}

Notes

This is an easy approach to implementing UI customization on a per-client basis. This is UI logic, not authorization. The different options are only hidden or removed from the UI using the client parameters; this logic MUST NOT be relied on if hiding an option is a security requirement. This is not authorization.

Links

https://docs.duendesoftware.com/identityserver/v7

https://docs.duendesoftware.com/identityserver/v7/ui/federation/

https://learn.microsoft.com/en-us/aspnet/core/razor-pages


Simon Willison

LLM 0.22, the annotated release notes

I released LLM 0.22 this evening. Here are the annotated release notes: model.prompt(..., key=) for API keys chatgpt-4o-latest llm logs -s/--short llm models -q gemini -q exp llm embed-multi --prepend X Everything else model.prompt(..., key=) for API keys Plugins that provide models that use API keys can now subclass the new llm.KeyModel and llm.AsyncKeyModel

I released LLM 0.22 this evening. Here are the annotated release notes:

model.prompt(..., key=) for API keys
chatgpt-4o-latest
llm logs -s/--short
llm models -q gemini -q exp
llm embed-multi --prepend X
Everything else

model.prompt(..., key=) for API keys
Plugins that provide models that use API keys can now subclass the new llm.KeyModel and llm.AsyncKeyModel classes. This results in the API key being passed as a new key parameter to their .execute() methods, and means that Python users can pass a key as the model.prompt(..., key=) - see Passing an API key. Plugin developers should consult the new documentation on writing Models that accept API keys. #744

This is the big change. It's only relevant to you if you use LLM as a Python library and you need the ability to pass API keys for OpenAI, Anthropic, Gemini etc in yourself in Python code rather than setting them as an environment variable.

It turns out I need to do that for Datasette Cloud, where API keys are retrieved from individual customer's secret stores!

Thanks to this change, it's now possible to do things like this - the key= parameter to model.prompt() is new:

import llm

model = llm.get_model("gpt-4o-turbo")
response = model.prompt("Surprise me!", key="my-api-key")
print(response.text())

Other plugins need to be updated to take advantage of this new feature. Here's the documentation for plugin developers - I've released llm-anthropic 0.13 and llm-gemini 0.11 implementing the new pattern.
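
For plugin authors, the change is mostly a matter of switching the base class. Here's a rough, hypothetical sketch of what that might look like - the key parameter on execute() follows the release note quoted above, but the other details (attribute names, signature) are assumptions, so check the plugin author documentation:

import llm

class MyKeyModel(llm.KeyModel):
    model_id = "my-model"
    needs_key = "my-provider"            # name used with `llm keys set my-provider`
    key_env_var = "MY_PROVIDER_API_KEY"  # fallback environment variable

    def execute(self, prompt, stream, response, conversation=None, key=None):
        # key arrives here, resolved from model.prompt(..., key=...), stored keys,
        # or the environment variable above (assumed wiring for this sketch)
        yield f"(pretend API call, key provided: {key is not None}) {prompt.prompt}"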

chatgpt-4o-latest
New OpenAI model: chatgpt-4o-latest. This model ID accesses the current model being used to power ChatGPT, which can change without warning. #752

This model has actually been around since August 2024 but I had somehow missed it. chatgpt-4o-latest is a model alias that provides access to the current model that is being used for GPT-4o running on ChatGPT, which is not the same as the GPT-4o models usually available via the API. It got an upgrade last week so it's currently the alias that provides access to the most recently released OpenAI model.

Most OpenAI models such as gpt-4o provide stable date-based aliases like gpt-4o-2024-08-06 which effectively let you "pin" to that exact model version. OpenAI technical staff have confirmed that they don't change the model without updating that name.

The one exception is chatgpt-4o-latest - that one can change without warning and doesn't appear to have release notes at all.

It's also a little more expensive than gpt-4o - currently priced at $5/million tokens for input and $15/million for output, compared to GPT-4o's $2.50/$10.
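
As a back-of-the-envelope illustration of that price difference (the token counts here are made up, the per-million rates are the ones quoted above):

# Cost in USD for a prompt with 10,000 input tokens and 2,000 output tokens
def cost(input_tokens, output_tokens, input_rate, output_rate):
    return input_tokens / 1_000_000 * input_rate + output_tokens / 1_000_000 * output_rate

print(cost(10_000, 2_000, 5.00, 15.00))   # chatgpt-4o-latest: ~$0.08
print(cost(10_000, 2_000, 2.50, 10.00))   # gpt-4o: ~$0.045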

It's a fun model to play with though! As of last week it appears to be very chatty and keen on using emoji. It also claims that it has a July 2024 training cut-off.

llm logs -s/--short
New llm logs -s/--short flag, which returns a greatly shortened version of the matching log entries in YAML format with a truncated prompt and without including the response. #737

The llm logs command lets you search through logged prompt-response pairs - I have 4,419 of them in my database, according to this command:

sqlite-utils tables "$(llm logs path)" --counts | grep responses
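
If you'd rather do that count from Python, here's a minimal sketch (it assumes llm is on your PATH so that llm logs path can locate the database):

import sqlite3
import subprocess

# Locate the logs database via the llm CLI, then count logged responses
db_path = subprocess.check_output(["llm", "logs", "path"], text=True).strip()
with sqlite3.connect(db_path) as conn:
    (count,) = conn.execute("SELECT count(*) FROM responses").fetchone()
print(f"{count} logged responses")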

By default it outputs the full prompts and responses as Markdown - and since I've started leaning more into long context models (some recent examples) my logs have been getting pretty hard to navigate.

The new -s/--short flag provides a much more concise YAML format. Here are some of my recent prompts that I've run using Google's Gemini 2.0 Pro experimental model - the -u flag includes usage statistics, and -n 4 limits the output to the most recent 4 entries:

llm logs --short -m gemini-2.0-pro-exp-02-05 -u -n 4

- model: gemini-2.0-pro-exp-02-05
  datetime: '2025-02-13T22:30:48'
  conversation: 01jm0q045fqp5xy5pn4j1bfbxs
  prompt: '<documents> <document index="1"> <source>./index.md</source> <document_content>
    # uv An extremely fast Python package...'
  usage:
    input: 281812
    output: 1521
- model: gemini-2.0-pro-exp-02-05
  datetime: '2025-02-13T22:32:29'
  conversation: 01jm0q045fqp5xy5pn4j1bfbxs
  prompt: I want to set it globally so if I run uv run python anywhere on my computer I always get 3.13
  usage:
    input: 283369
    output: 1540
- model: gemini-2.0-pro-exp-02-05
  datetime: '2025-02-14T23:23:57'
  conversation: 01jm3cek8eb4z8tkqhf4trk98b
  prompt: '<documents> <document index="1"> <source>./LORA.md</source> <document_content>
    # Fine-Tuning with LoRA or QLoRA You c...'
  usage:
    input: 162885
    output: 2558
- model: gemini-2.0-pro-exp-02-05
  datetime: '2025-02-14T23:30:13'
  conversation: 01jm3csstrfygp35rk0y1w3rfc
  prompt: '<documents> <document index="1"> <source>huggingface_hub/__init__.py</source> <document_content>
    # Copyright 2020 The...'
  usage:
    input: 480216
    output: 1791

llm models -q gemini -q exp
Both llm models and llm embed-models now take multiple -q search fragments. You can now search for all models matching "gemini" and "exp" using llm models -q gemini -q exp. #748

I have over 100 models installed in LLM now across a bunch of different plugins. I added the -q option to help search through them a few months ago, and now I've upgraded it so you can pass it multiple times.

Want to see all the Gemini experimental models?

llm models -q gemini -q exp

Outputs:

GeminiPro: gemini-exp-1114
GeminiPro: gemini-exp-1121
GeminiPro: gemini-exp-1206
GeminiPro: gemini-2.0-flash-exp
GeminiPro: learnlm-1.5-pro-experimental
GeminiPro: gemini-2.0-flash-thinking-exp-1219
GeminiPro: gemini-2.0-flash-thinking-exp-01-21
GeminiPro: gemini-2.0-pro-exp-02-05 (aliases: g2)

For consistency I added the same options to the llm embed-models command, which lists available embedding models.

llm embed-multi --prepend X
New llm embed-multi --prepend X option for prepending a string to each value before it is embedded - useful for models such as nomic-embed-text-v2-moe that require passages to start with a string like "search_document: ". #745

This was inspired by my initial experiments with Nomic Embed Text V2 last week.

Everything else
The response.json() and response.usage() methods are now documented.

Someone asked a question about these methods online, which made me realize they weren't documented. I enjoy promptly turning questions like this into documentation!
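
A quick sketch of what those newly documented methods look like in use (the model name and prompt here are just placeholders, and an API key needs to be configured):

import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Say hi in five words")
print(response.text())    # the generated text
print(response.json())    # raw response data from the underlying API
print(response.usage())   # token usage for the prompt and response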

Fixed a bug where conversations that were loaded from the database could not be continued using asyncio prompts. #742

This bug was reported by Romain Gehrig. It turned out not to be possible to execute a follow-up prompt in async mode if the previous conversation had been loaded from the database.

% llm 'hi' --async
Hello! How can I assist you today?
% llm 'now in french' --async -c
Error: 'async for' requires an object with __aiter__ method, got Response

I fixed the bug for the moment, but I'd like to make the whole mechanism of persisting and loading conversations from SQLite part of the documented and supported Python API - it's currently tucked away in CLI-specific internals which aren't safe for people to use in their own code.

New plugin for macOS users: llm-mlx, which provides extremely high performance access to a wide range of local models using Apple's MLX framework.

Technically not a part of the LLM 0.22 release, but I like using the release notes to help highlight significant new plugins, and llm-mlx is fast becoming my new favorite way to run models on my own machine.

The llm-claude-3 plugin has been renamed to llm-anthropic.

I wrote about this previously when I announced llm-anthropic. The new name prepares me for a world in which Anthropic release models that aren't called Claude 3 or Claude 3.5!

Tags: projects, ai, annotated-release-notes, openai, generative-ai, chatgpt, llms, llm, anthropic, gemini


John Philpin : Lifestream

🗺️ 🚘 📸 👀

🗺️ 🚘 📸 👀

🗺️ 🚘 📸 👀


🗺️ 🚘 📸 👀

🗺️ 🚘 📸 👀

🗺️ 🚘 📸 👀


🗺️ 🚘 📸 👀

🗺️ 🚘 📸 👀

🗺️ 🚘 📸 👀


🗺️ 🚘 📸 👀

🗺️ 🚘 📸 👀

🗺️ 🚘 📸 👀


A couple o’ quotes via John Naughton … If it’s sent by s

A couple o’ quotes via John Naughton … If it’s sent by ship, it’s cargo. If sent by road, it’s shipment. 💬 Dave Allen We stand at a crossroads. One path leads to despair, the other to destruction. Let’s hope we make the right choice. 💬 Woody Allen

A couple o’ quotes via John Naughton …

If it’s sent by ship, it’s cargo. If sent by road, it’s shipment.

💬 Dave Allen

We stand at a crossroads. One path leads to despair, the other to destruction. Let’s hope we make the right choice.

💬 Woody Allen


Can We Please Stop Calling These People Populists? – Trump

Can We Please Stop Calling These People Populists? – Trumpism is not a working-class revolt against the elites. All I see is one section of the educated elite going after another section of the educated elite. 💬 David Brooks Dear David - of course we can - just as soon as you stop calling those other people ‘Elites’.. they are not. They just happen to have a lot of money and so feel enti

Can We Please Stop Calling These People Populists? – Trumpism is not a working-class revolt against the elites. All I see is one section of the educated elite going after another section of the educated elite.

💬 David Brooks

Dear David - of course we can - just as soon as you stop calling those other people ‘Elites’.. they are not.

They just happen to have a lot of money and so feel entitled to be treated differently to the rest of us.


Talent hits a target no one else can hit; genius hits a ta

Talent hits a target no one else can hit; genius hits a target no one else can see. 💬 Schopenhauer (via John. Naughton) I kinda get it. BUT. If we can’t see it .. how do we know?

Talent hits a target no one else can hit; genius hits a target no one else can see.

💬 Schopenhauer

(via John Naughton)

I kinda get it. BUT. If we can’t see it .. how do we know?


@_Nat Zone

March 6: Appearing on the FIN/SUM symposium panel "Digital Authentication and Digital ID: The Future of Living, Work, and Society"

As I do every year, I will be at FIN/SUM again this year, moderating a panel of truly distinguished panelists. It is the FIN/SUM symposium panel hosted by Nikkei. 17:40-18:30 Panel: Digital Authentication and Digital ID: The Future of Living, Work, and Society. Session overview. Date: March 6…

As I do every year, I will be at FIN/SUM again this year. I will have the honor of moderating a panel of truly distinguished panelists. It is the FIN/SUM symposium panel hosted by Nikkei.

17:40-18:30 Panel: Digital Authentication and Digital ID: The Future of Living, Work, and Society

Session overview
Date and time: Thursday, March 6, 17:40-18:30
Venue: Marunouchi Building Conference Hall (HALL A: Marunouchi Building Hall, 7F)
Title: Digital Authentication and Digital ID: The Future of Living, Work, and Society

In a digital society where the arrival of generative AI has brought a sharp rise in crimes such as unauthorized access and impersonation fraud, digital authentication and digital ID aim to prevent these risks while applying a variety of innovations to realize a better digital society. We will explore what an optimized future digital society might look like, from financial services to government, ways of working, and everyday life.

Speakers (honorifics omitted)

山廣弘佳, Team Leader, Information Innovation Business Division, Dai Nippon Printing
日下光, CEO, xID
林達也, Identity Architect, Digital Agency
間下公照, Head of the Market Research Office, Innovation Department, JCB
崎村夏彦 (Nat Sakimura), Chairman, OpenID Foundation (moderator)

The session immediately after this one is also interesting, so please check it out as well.

18:40-19:30 Candid Talk with the Government and the Bank of Japan: Money and Fintech in a New Era

Session overview
Date and time: Thursday, March 6, 18:40-19:30
Venue: Marunouchi Building Conference Hall (HALL A: Marunouchi Building Hall, 7F)
Title: Candid Talk with the Government and the Bank of Japan: Money and Fintech in a New Era

Frontline leads from the Ministry of Finance, the Bank of Japan, and the Financial Services Agency discuss the future of digital finance, centered on technology x money.

Speakers (honorifics omitted)

鳩貝淳一郎, Deputy Head of the FinTech Center and Head of the Digital Currency Verification Group, Payment and Settlement Systems Department, Bank of Japan
津田夏樹, Director of the Treasury Division, Financial Bureau, Ministry of Finance, and Digital Currency Planning Officer
牛田遼介, Chief Fintech Officer, Financial Services Agency
楠正憲, Director-General, Digital Agency

Registration

Please register via the official website.

Other programs

The other programs are as follows.

jp_session_2025

Sunday, 16. February 2025

Ben Werdmüller

Bringing Quote Posts to Mastodon

[Mastodon] Mastodon doesn't have quote posts, but is finally adding them after years of pressure. It's a harder decision than you might think - which is made clear by this excellent post by the team. In order to help mitigate potential abuse, the team has imposed three main requirements: You will be able to choose whether your posts can be quoted at all. You will be notif

[Mastodon]

Mastodon doesn't have quote posts, but is finally adding them after years of pressure. It's a harder decision than you might think - which is made clear by this excellent post by the team.

In order to help mitigate potential abuse, the team has imposed three main requirements:

You will be able to choose whether your posts can be quoted at all.
You will be notified when someone quotes you.
You will be able to withdraw your post from the quoted context at any time.

Some Mastodon clients fake support now by showing a post in a quoted context whenever it's linked to from another post, but this doesn't have any of the aforementioned properties, and therefore is more susceptible to abuse. And ActivityPub, as yet, doesn't have a great way to represent this either.

So it makes sense that it's taken a while: Mastodon wants to do it correctly to preserve community health, and do it in a standard way that other Fediverse participants can use, too.

I appreciate the transparency and approach. I'd love to see many more updates in this vein.

#Fediverse

[Link]


Silicon Valley Software Engineers Horrified as They Can't Easily Find Jobs Anymore

[Joe Wilkins in Futurism] The job market in the tech industry has been brutal for a little while now, and doesn't show signs of getting easier. "Of all the workers devastated by the carnage, former tech workers in Silicon Valley are having a particularly rough go of it. The region's former software engineers and developers — whose jobs were previously thought to be ironclad

[Joe Wilkins in Futurism]

The job market in the tech industry has been brutal for a little while now, and doesn't show signs of getting easier.

"Of all the workers devastated by the carnage, former tech workers in Silicon Valley are having a particularly rough go of it.

The region's former software engineers and developers — whose jobs were previously thought to be ironclad — are now having to contend with a fiercely competitive job market in one of the most expensive housing markets in the world."

Silicon Valley - which, here as in a lot of places, is incorrectly used to mean the San Francisco Bay Area - is in a bit of a pickle. Mass layoffs have driven down salaries, so many tech companies are quietly firing swathes of workers and re-hiring those seats in order to lower their costs. That's before you get to the actual downsizing, which has sometimes been significant.

And at the same time, living costs are sky-high, and house prices are all but unobtainable. When so many peoples' wealth is tied to the equity in their home, there are two possible outcomes: a significant drop in wealth as prices decline (particularly as fired employees flee for more affordable climes), or a significant increase in inequality as prices continue to climb. Either way, that doesn't look good.

That's a societal problem, but it's also a problem for the tech industry. Who can afford to found a startup when base prices are so high? The demographics of founders are narrowing to the already well off, forcing other founders to look elsewhere.

The solution will have to involve more help (potentially including more startup funding for a wider set of founders) or better jobs in the area. Otherwise Silicon Valley will continue to lose talent to other parts of the country and the world. Tech companies are trying to get their employees to return to the office to counteract this effect, but it simply won't be enough; no RTO policy is compelling enough when you can't afford to buy a house and bring up a family.

That's an opportunity for other ecosystems, but it's one that they will need to intentionally pick up. To date, smart tech ecosystem strategies in other parts of the world have been few and far between - not least because they aim for a similar level of talent density as Silicon Valley rather than embracing a remote, distributed culture.

I openly miss living in the Bay Area and may return in the future, so I have skin in the game. I'm curious to see what happens here.

#Technology

[Link]


John Philpin : Lifestream

🕵💬 Inspired By People | 🌐 Hubert H. Humphrey The Full C

🕵💬 Inspired By People | 🌐 Hubert H. Humphrey The Full Collection : Visual Scroll | 🔍 List Of Posts

🕵💬 Inspired By People | 🌐 Hubert H. Humphrey

The Full Collection : Visual Scroll | 🔍 List Of Posts


Ben Werdmüller

Life in Weeks

Inspired by Gina Trapani, Buster Benson, and others, I started to build my own Life in Weeks page from scratch. It looks pretty cool, and it’s interesting to see my life milestones presented on this scale. But I’m not going to share it with you. As I was building it, it became clear how much personal information I was sharing — and not just my own, but that of my parents, my sister, my partner

Inspired by Gina Trapani, Buster Benson, and others, I started to build my own Life in Weeks page from scratch. It looks pretty cool, and it’s interesting to see my life milestones presented on this scale.

But I’m not going to share it with you.

As I was building it, it became clear how much personal information I was sharing — and not just my own, but that of my parents, my sister, my partner, my child, other members of my extended family. It’s a privacy violation at best and an identity theft goldmine at worst. My life is mine, their lives are theirs, and these things don’t need to be published in public on the web.

This is, perhaps, an area of growth for me: Ben in his twenties would absolutely have published it. But our lives are like a fingerprint; unique to us. Not everything needs to be made available for free to everyone.

The code is pretty simple and the payload is lightweight (unlike Gina and Buster, I haven’t relied on Bootstrap, for example), so I’m going to find another use for it. Maybe a Life in Weeks for characters from my novel? For the web itself? I’ll think about it.


An update on searching for trans-friendly employers who sponsor visas

Last month I asked to hear from trans-friendly employers who sponsor visas, and provided a simple form for interested employers to reach out. In the process, I heard from many individuals: people who were hoping to find new employment in another country, and people who worked for companies that were aligned, who were encouraging their bosses to fill in the form. A quick reminder before we div

Last month I asked to hear from trans-friendly employers who sponsor visas, and provided a simple form for interested employers to reach out. In the process, I heard from many individuals: people who were hoping to find new employment in another country, and people who worked for companies that were aligned, who were encouraging their bosses to fill in the form.

A quick reminder before we dive in: I’m not providing formal legal or financial advice. I’m just trying to point people in the right direction and provide some ideas for relocation for people who want it.

The bad news

Here’s the bad news: today, that form sits empty. While the post was shared far and wide, not a single person has filled it in.

I think there are a few reasons for this. First and foremost, in the current environment, being listed in such a database presents a significant risk, particularly if you’re doing business with US entities. In an environment where the administration is firing employees and cutting contracts for even the barest mention of support for trans people, there’s every reason to believe that the current administration will penalize people and organizations who work with trans people.

So, that’s not great. I’m very sorry to everyone who got their hopes up that I would be able to make direct connections.

The good news

The good news: some countries actively sponsor visas, welcome trans people, and are hiring.

In my personal conversations with people, what jumped out again and again was that emigrating to the Netherlands was a viable route for many people — and particularly those with tech skills (engineering, IT, product management, design, research, and so on).

Reasons include:

Dutch tech companies tend to sponsor visas. (Look for job vacancies written in English and/or on boards like Tech Me Abroad and Relocate.me.)
You can relocate with a self-employment visa if you have a skilled background, can demonstrate at least one Dutch-based contract or that you’ll start a tech-based business, and can show that you’ll independently make €1,613 (~$1,700) per month.
The DAFT visa lowers the barrier to entry for US entrepreneurs who can commit to invest €4,500 (~$4,700).
The Netherlands is a progressive country, even given its current right-wing government.
90-93% of Dutch people speak English fluently.

The Netherlands is also kind of just a neat country: excellent social safety net, great support for culture and the arts, good connectivity to other European countries, and a strong grant support network for mission-driven tech. Amsterdam is a first-class cosmopolitan city, but other centers in the Netherlands are not to be sniffed at, and the country is so small that you can easily take public transit from one to another in less time than it might take you to commute to work by car in the US.

It is not, however, perfect. Much like the US, the Netherlands has had its own racial reckoning; unlike the US, the discourse has often centered on the idea that racism doesn’t happen there. That’s a rich claim from a society where racist tropes like Zwarte Piet are still commonplace, and where women of color are often marginalized. There’s work to be done — although it’s worth asking if this is truly any worse than the US.

Not everybody can relocate, and not everybody has these skills. I’m aware that this is a privileged route that not everybody can take advantage of. It would be better if there was a defined route for everybody who needed to find a safer place to live; it would be better still if a safe place to live was the place they already call home. This situation is sick and sad, and I truly wish that everything was different.

It also comes with an attendant cost. It’s estimated that moving to the Netherlands will set you back between $6-10K. That’s a lot less than one might expect, but it’s obviously a significant barrier for many people. Unfortunately, very little financial support exists for these moves. If you know of grants, mutual aid funds, or community resources that help trans people relocate, please share them. Funding and guidance from those who’ve navigated the process could make all the difference.

Please reach out

In the meantime, I’ll keep looking. If you are a company in a country that is safe for trans people, and you’re looking to hire people from the US who need visa sponsorship, please fill out this form or reach out to me via email. I’m not giving up.


Organizing on decentralized social networks

[Jon Pincus at The Nexus of Privacy] There's an argument that one reason Elon Musk bought Twitter was to reduce its effectiveness as a platform for progressive organizing. Whether you buy that or not, it's clear that the new set of social networks are fertile ground for new and existing movements to make progress. The question is: how? Jon is an experienced organizer and is h

[Jon Pincus at The Nexus of Privacy]

There's an argument that one reason Elon Musk bought Twitter was to reduce its effectiveness as a platform for progressive organizing. Whether you buy that or not, it's clear that the new set of social networks are fertile ground for new and existing movements to make progress.

The question is: how? Jon is an experienced organizer and is here to help out:

"The Nexus of Privacy is planning a series of online discussions and video/phone calls focusing on organizing on decentralized social networks. There's a range of topics to cover, including looking at the tradeoffs between the different platforms for different use cases, brainstorming how organizers can leverage these platforms, easy ways to start exploring, and ways for groups to move as a whole."

There's a form to express interest (which uses CryptPad to support anonymity, which is both new to me and seems like a great platform in itself). If you're interested in organizing using decentralized social networks as a tool, these sessions look like they'll be a good resource.

#Fediverse

[Link]


Tech continues to be political

[Miriam Eric Suzanne] Every single word of this piece resonated for me, from the underlying discomfort to the realization that AI as it currently manifests reflects a kind of fascist mindset in itself: an enclosure movement of culture and diversity that concentrates power into a handful of vendors. This is true of me too: "Based on every conference I’ve attended over the la

[Miriam Eric Suzanne]

Every single word of this piece resonated for me, from the underlying discomfort to the realization that AI as it currently manifests reflects a kind of fascist mindset in itself: an enclosure movement of culture and diversity that concentrates power into a handful of vendors.

This is true of me too:

"Based on every conference I’ve attended over the last year, I can absolutely say we’re a fringe minority. And it’s wearing me out. I don’t know how to participate in a community that so eagerly brushes aside the active and intentional/foundational harms of a technology. In return for what? Faster copypasta? Automation tools being rebranded as an “agentic” web? Assurance that we won’t be left behind?"

I think drawing the line between "tech" and "the web" is important, and this piece captures exactly how I've been feeling about it:

"“Tech” was always a vague and hand-waving field – a way to side-step regulations while starting an unlicensed taxi company or hotel chain. That was never my interest.

But I got curious about the web, a weird little project built for sharing research between scientists. And I still think this web could be pretty cool, actually, if it wasn’t trapped in the clutches of big tech. If we can focus on the bits that make it special – the bits that make it unwieldy for capitalism."

So this post made me (1) feel less alone (2) like I want to be friends with its author. This is a fringe feeling, unfortunately, but if enough of us stick together, maybe we can manifest an alternative.

#Technology

[Link]


Simon Willison

Introducing Perplexity Deep Research

Introducing Perplexity Deep Research Perplexity become the third company to release a product with "Deep Research" in the name. Google's Gemini Deep Research: Try Deep Research and our new experimental model in Gemini, your AI assistant on December 11th 2024 OpenAI's ChatGPT Deep Research: Introducing deep research - February 2nd 2025 And now Perplexity Deep Research, announced on Febr

Introducing Perplexity Deep Research

Perplexity became the third company to release a product with "Deep Research" in the name.

Google's Gemini Deep Research: Try Deep Research and our new experimental model in Gemini, your AI assistant on December 11th 2024
OpenAI's ChatGPT Deep Research: Introducing deep research - February 2nd 2025

And now Perplexity Deep Research, announced on February 14th.

The three products all do effectively the same thing: you give them a task, they go out and accumulate information from a large number of different websites and then use long context models and prompting to turn the result into a report. All three of them take several minutes to return a result.

In my AI/LLM predictions post on January 10th I expressed skepticism at the idea of "agents", with the exception of coding and research specialists. I said:

It makes intuitive sense to me that this kind of research assistant can be built on our current generation of LLMs. They’re competent at driving tools, they’re capable of coming up with a relatively obvious research plan (look for newspaper articles and research papers) and they can synthesize sensible answers given the right collection of context gathered through search.

Google are particularly well suited to solving this problem: they have the world’s largest search index and their Gemini model has a 2 million token context. I expect Deep Research to get a whole lot better, and I expect it to attract plenty of competition.

Just over a month later I'm feeling pretty good about that prediction!

Tags: gemini, ai-agents, ai, llms, google, generative-ai, perplexity, chatgpt

Saturday, 15. February 2025

Simon Willison

Quoting Kevin Samuel

[...] if your situation allows it, always try uv first. Then fall back on something else if that doesn’t work out. It is the Pareto solution because it's easier than trying to figure out what you should do and you will rarely regret it. Indeed, the cost of moving to and from it is low, but the value it delivers is quite high. — Kevin Samuel, Bite code! Tags: uv, astral, python

[...] if your situation allows it, always try uv first. Then fall back on something else if that doesn’t work out.

It is the Pareto solution because it's easier than trying to figure out what you should do and you will rarely regret it. Indeed, the cost of moving to and from it is low, but the value it delivers is quite high.

Kevin Samuel, Bite code!

Tags: uv, astral, python


Run LLMs on macOS using llm-mlx and Apple's MLX framework

llm-mlx is a brand new plugin for my LLM Python Library and CLI utility which builds on top of Apple's excellent MLX array framework library and mlx-lm package. If you're a terminal user or Python developer with a Mac this may be the new easiest way to start exploring local Large Language Models. Running Llama 3.2 3B using llm-mlx If you haven't already got LLM installed you'll need to install

llm-mlx is a brand new plugin for my LLM Python Library and CLI utility which builds on top of Apple's excellent MLX array framework library and mlx-lm package. If you're a terminal user or Python developer with a Mac this may be the new easiest way to start exploring local Large Language Models.

Running Llama 3.2 3B using llm-mlx

If you haven't already got LLM installed you'll need to install it - you can do that in a bunch of different ways - in order of preference I like uv tool install llm or pipx install llm or brew install llm or pip install llm.

Next, install the new plugin (macOS only):

llm install llm-mlx

Now download and register a model. Llama 3.2 3B is an excellent first choice - it's pretty small (a 1.8GB download) but is a surprisingly capable starter model.

llm mlx download-model mlx-community/Llama-3.2-3B-Instruct-4bit

This will download 1.8GB of model weights from mlx-community/Llama-3.2-3B-Instruct-4bit on Hugging Face and store them here:

~/.cache/huggingface/hub/models--mlx-community--Llama-3.2-3B-Instruct-4bit

Now you can start running prompts:

llm -m mlx-community/Llama-3.2-3B-Instruct-4bit 'Python code to traverse a tree, briefly'

Which output this for me:

Here's a brief example of how to traverse a tree in Python:

class Node:
    def __init__(self, value):
        self.value = value
        self.children = []

def traverse_tree(node):
    if node is None:
        return
    print(node.value)
    for child in node.children:
        traverse_tree(child)

# Example usage:
root = Node("A")
root.children = [Node("B"), Node("C")]
root.children[0].children = [Node("D"), Node("E")]
root.children[1].children = [Node("F")]

traverse_tree(root)
# Output: A, B, D, E, C, F

In this example, we define a Node class to represent each node in the tree, with a value attribute and a list of children. The traverse_tree function recursively visits each node in the tree, printing its value.

This is a basic example of a tree traversal, and there are many variations and optimizations depending on the specific use case.

That generation ran at an impressive 152 tokens per second!

That command was a bit of a mouthful, so let's assign an alias to the model:

llm aliases set l32 mlx-community/Llama-3.2-3B-Instruct-4bit

Now we can use that shorter alias instead:

llm -m l32 'a joke about a haggis buying a car'

(The joke isn't very good.)

As with other models supported by LLM, you can also pipe things to it. Here's how to get it to explain a piece of Python code (in this case itself):

cat llm_mlx.py | llm -m l32 'explain this code'

The response started like this:

This code is a part of a larger project that uses the Hugging Face Transformers library to create a text-to-text conversational AI model. The code defines a custom model class MlxModel and a set of command-line interface (CLI) commands for working with MLX models. [...]

Here's the rest of the response. I'm pretty amazed at how well it did for a tiny 1.8GB model!

This plugin can only run models that have been converted to work with Apple's MLX framework. Thankfully the mlx-community organization on Hugging Face has published over 1,000 of these. A few I've tried successfully:

mlx-community/Qwen2.5-0.5B-Instruct-4bit - 278 MB
mlx-community/Mistral-7B-Instruct-v0.3-4bit - 4.08 GB
mlx-community/Mistral-Small-24B-Instruct-2501-4bit - 13.26 GB
mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit - 18.5 GB
mlx-community/Llama-3.3-70B-Instruct-4bit - 40 GB

Mistral-Small-24B really does feel GPT-4 quality despite only needing around 12GB of RAM to run - so it's a good default model if you want to leave space to run other apps.

DeepSeek-R1-Distill-Qwen-32B-4bit is a great model if you want to try out R1 style "reasoning" which outputs a <think>...</think>... block at the start of each response.

That last Llama 70B one needs a machine with 64GB of RAM to run but is very capable - I wrote about trying out the Ollama version of Llama 3.3 70B in I can now run a GPT-4 class model on my laptop back in December.

llm mlx download-model mlx-community/Llama-3.3-70B-Instruct-4bit

For larger models like that I recommend using the llm chat command as that ensures they stay in memory for multiple rounds of the conversation:

llm chat -m mlx-community/Llama-3.3-70B-Instruct-4bit

Activity Monitor shows the process using 37.5GB of RAM while the model is running.

Some of these jokes are actually pretty good!

Chatting with mlx-community/Llama-3.3-70B-Instruct-4bit
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
> hi
How's your day going so far? Is there something I can help you with or would you like to chat?
> jokes about weird animals please
Here are a few:
1. Why did the axolotl go to therapy? Because it was struggling to come to terms with its identity crisis... and also because it had a lot of "regrowing" to do!
2. What did the aye-aye say to the other aye-aye when they were out on a date? "You've got a finger on the pulse of my heart!"
3. Why did the naked mole rat go to the party? Because it was a "hair-raising" experience and it wanted to get to the "root" of the fun!
4. What do you call a group of flamingos playing instruments? The Pink Floyd... of the avian world, of course!
5. Why did the blobfish go to the doctor? It had a "gut" feeling something was wrong... and also because it was feeling a little "deflated"!
Want more? I've got a whole zoo of 'em!

That ran at 8.8 tokens per second.

As with every LLM plugin, all prompts and responses are logged to a SQLite database that can be accessed using the llm logs command.

You can use LLM models from your Python code as well. The README includes notes on how to do this, as does the LLM Python API documentation. TLDR:

import llm

model = llm.get_model("mlx-community/Llama-3.2-3B-Instruct-4bit")
print(model.prompt("hi").text())

MLX and mlx-lm

The secret sauce behind this new plugin is Apple's MLX library, which has been improving at an extraordinary pace over the past year - it first launched on GitHub in November 2023.

The core library is C++, but the MLX team at Apple also maintain a mlx-examples repo with a wealth of Python examples, including the mlx-lm and mlx-whisper packages.

The community around it is growing too. I'm particularly impressed by Prince Canuma's mlx-vlm library for accessing multi-modal vision LLMs - I'd love to get that working as an LLM plugin once I figure out how to use it for conversations.

I've used MLX for a few experiments in the past, but this tweet from MLX core developer Awni Hannun finally convinced me to wrap it up as an LLM plugin:

In the latest MLX small LLMs are a lot faster.

On M4 Max 4-bit Qwen 0.5B generates 1k tokens at a whopping 510 toks/sec. And runs at over 150 tok/sec on iPhone 16 pro.

This is really good software. This small team at Apple appear to be almost single-handedly giving NVIDIA's CUDA a run for their money!

Building the plugin

The llm-mlx plugin came together pretty quickly. The first version was ~100 lines of Python, much of it repurposed from my existing llm-gguf plugin.

The hardest problem was figuring out how to hide the Hugging Face progress bars that displayed every time it attempted to access a model!

I eventually found the from huggingface_hub.utils import disable_progress_bars utility function by piping library code through Gemini 2.0.
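
For reference, the call is a one-liner once you know it exists - something like this (a minimal sketch; llm-mlx wires it up internally):

from huggingface_hub.utils import disable_progress_bars

# Silence the Hugging Face download/loading progress bars globally
disable_progress_bars()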

I then added model options support allowing you to pass options like this:

llm -m l32 'a greeting' -o temperature 1.0 -o seed 2

So far using a fixed seed appears to produce repeatable results, which is exciting for writing more predictable tests in the future.
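
In the Python API those options become keyword arguments to prompt(), so a repeatability check could look roughly like this (a sketch, assuming the option names above carry over unchanged and the model has been downloaded):

import llm

model = llm.get_model("mlx-community/Llama-3.2-3B-Instruct-4bit")
first = model.prompt("a greeting", temperature=1.0, seed=2).text()
second = model.prompt("a greeting", temperature=1.0, seed=2).text()
assert first == second  # same seed and options: expect identical output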

For the automated tests that run in GitHub Actions I decided to use a small model - I went with the tiny 75MB mlx-community/SmolLM-135M-Instruct-4bit (explored previously). I configured GitHub Actions to cache the model in between CI runs by adding the following YAML to my .github/workflows/test.yml file:

- name: Cache models
  uses: actions/cache@v4
  with:
    path: ~/.cache/huggingface
    key: ${{ runner.os }}-huggingface-

Tags: apple, plugins, projects, ai, github-actions, generative-ai, llms, llm, mlx


John Philpin : Lifestream

🖇️ Should We Cry For Argentina? … my latest newsletter if yo

🖇️ Should We Cry For Argentina? … my latest newsletter if you are interested. A bit different to my usual newsletter themes.

🖇️ Should We Cry For Argentina? … my latest newsletter if you are interested. A bit different to my usual newsletter themes.


A bully is a felony waiting to happen. 💬 Dexter

A bully is a felony waiting to happen. 💬 Dexter

A bully is a felony waiting to happen.

💬 Dexter


🗺️ 🚘 📸 👀 Road Trip

🗺️ 🚘 📸 👀 Road Trip

🗺️ 🚘 📸 👀

Road Trip


🕵💬 Inspired By People | 🌐 Rutherford B. Hayes The Full

🕵💬 Inspired By People | 🌐 Rutherford B. Hayes The Full Collection : Visual Scroll | 🔍 List Of Posts

🕵💬 Inspired By People | 🌐 Rutherford B. Hayes

The Full Collection : Visual Scroll | 🔍 List Of Posts


“She knows it’s someone on the inside.“ Isn’t it alway

“She knows it’s someone on the inside.“ Isn’t it always?

“She knows it’s someone on the inside.“

Isn’t it always?


Boo Ya

Exploring Options and Testing Assumptions

I know. Cheating!


🔗📸 Well. Who knew? - and ‘just’ down the road apparently.

🔗📸 Well. Who knew? - and ‘just’ down the road apparently.

🔗📸 Well. Who knew? - and ‘just’ down the road apparently.

Friday, 14. February 2025

John Philpin : Lifestream

💬

💬

💬


Bill Wendels Real Estate Cafe

Multiply Tiny House Love Story with Million Backyard March

After a post about a college that’s teaching students how to build their own tiny houses for just $5,000 drew a lot of attention, decided… The post Multiply Tiny House Love Story with Million Backyard March first appeared on Real Estate Cafe.

After a post about a college that’s teaching students how to build their own tiny houses for just $5,000 drew a lot of attention, decided…

The post Multiply Tiny House Love Story with Million Backyard March first appeared on Real Estate Cafe.

John Philpin : Lifestream

David Frum is not wrong … 🔗 The ‘Gulf of America’ Is an Ad

David Frum is not wrong … 🔗 The ‘Gulf of America’ Is an Admission of Defeat

This Is Not A Long Form Post ..

A post that points out that all is not what it seems.

.. I know it looks like one - but it isn’t. I am just experimenting - move along now and sincere apologies if I distracted you.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse varius sollicitudin consequat. Etiam cursus blandit nisl accumsan fermentum. Phasellus faucibus velit non porttitor tincidunt. Ut quis erat ac nibh auctor tempus. Sed a metus sed dui pulvinar dapibus pulvinar et nisl. Sed consectetur, magna sit amet vestibulum dapibus, augue orci dignissim nulla, nec interdum ligula nibh at dui. In in dolor sit amet urna tempor pulvinar. In ut odio et ligula faucibus placerat. Proin pulvinar ex et sagittis molestie. Vestibulum dignissim faucibus diam, quis lacinia lacus mollis et. In fermentum ex quis consectetur semper. Nullam ut metus quam. Suspendisse potenti.


🕵💬 Inspired By People | 🌐 John Stuart Mill The Full Col

🕵💬 Inspired By People | 🌐 John Stuart Mill The Full Collection : Visual Scroll | 🔍 List Of Posts

🕵💬 Inspired By People | 🌐 John Stuart Mill

The Full Collection : Visual Scroll | 🔍 List Of Posts


Simon Willison

files-to-prompt 0.5

files-to-prompt 0.5 My files-to-prompt tool (originally built using Claude 3 Opus back in April) had been accumulating a bunch of issues and PRs - I finally got around to spending some time with it and pushed a fresh release: New -n/--line-numbers flag for including line numbers in the output. Thanks, Dan Clayton. #38 Fix for utf-8 handling on Windows. Thanks, David Jarman. #36 --ignore

files-to-prompt 0.5

My files-to-prompt tool (originally built using Claude 3 Opus back in April) had been accumulating a bunch of issues and PRs - I finally got around to spending some time with it and pushed a fresh release:

New -n/--line-numbers flag for including line numbers in the output. Thanks, Dan Clayton. #38
Fix for utf-8 handling on Windows. Thanks, David Jarman. #36
--ignore patterns are now matched against directory names as well as file names, unless you pass the new --ignore-files-only flag. Thanks, Nick Powell. #30

I use this tool myself on an almost daily basis - it's fantastic for quickly answering questions about code. Recently I've been plugging it into Gemini 2.0 with its 2 million token context length, running recipes like this one:

git clone https://github.com/bytecodealliance/componentize-py
cd componentize-py
files-to-prompt . -c | llm -m gemini-2.0-pro-exp-02-05 \
  -s 'How does this work? Does it include a python compiler or AST trick of some sort?'

I ran that question against the bytecodealliance/componentize-py repo - which provides a tool for turning Python code into compiled WASM - and got this really useful answer.

Here's another example. I decided to have o3-mini review how Datasette handles concurrent SQLite connections from async Python code - so I ran this:

git clone https://github.com/simonw/datasette
cd datasette/datasette
files-to-prompt database.py utils/__init__.py -c | \
  llm -m o3-mini -o reasoning_effort high \
  -s 'Output in markdown a detailed analysis of how this code handles the challenge of running SQLite queries from a Python asyncio application. Explain how it works in the first section, then explore the pros and cons of this design. In a final section propose alternative mechanisms that might work better.'

Here's the result. It did an extremely good job of explaining how my code works - despite being fed just the Python and none of the other documentation. Then it made some solid recommendations for potential alternatives.

I added a couple of follow-up questions (using llm -c) which resulted in a full working prototype of an alternative threadpool mechanism, plus some benchmarks.

One final example: I decided to see if there were any undocumented features in Litestream, so I checked out the repo and ran a prompt against just the .go files in that project:

git clone https://github.com/benbjohnson/litestream
cd litestream
files-to-prompt . -e go -c | llm -m o3-mini \
  -s 'Write extensive user documentation for this project in markdown'

Once again, o3-mini provided a really impressively detailed set of unofficial documentation derived purely from reading the source.

Tags: projects, llms, gemini, llm, ai-assisted-programming, generative-ai, ai, webassembly, python, async, datasette, sqlite, litestream


How to add a directory to your PATH

How to add a directory to your PATH Classic Julia Evans piece here, answering a question which you might assume is obvious but very much isn't. Plenty of useful tips in here, plus the best explanation I've ever seen of the three different Bash configuration options: Bash has three possible config files: ~/.bashrc, ~/.bash_profile, and ~/.profile. If you're not sure which one your system i

How to add a directory to your PATH

Classic Julia Evans piece here, answering a question which you might assume is obvious but very much isn't.

Plenty of useful tips in here, plus the best explanation I've ever seen of the three different Bash configuration options:

Bash has three possible config files: ~/.bashrc, ~/.bash_profile, and ~/.profile.

If you're not sure which one your system is set up to use, I'd recommend testing this way:

add echo hi there to your ~/.bashrc
Restart your terminal
If you see "hi there", that means ~/.bashrc is being used! Hooray!
Otherwise remove it and try the same thing with ~/.bash_profile
You can also try ~/.profile if the first two options don't work.

This article also reminded me to try which -a again, which gave me this confusing result for datasette:

% which -a datasette
/opt/homebrew/Caskroom/miniconda/base/bin/datasette
/Users/simon/.local/bin/datasette
/Users/simon/.local/bin/datasette

Why is the second path in there twice? I figured out how to use rg to search just the dot-files in my home directory:

rg local/bin -g '/.*' --max-depth 1

And found that I have both a .zshrc and .zprofile file that are adding that to my path:

.zshrc.backup
4:export PATH="$PATH:/Users/simon/.local/bin"
.zprofile
5:export PATH="$PATH:/Users/simon/.local/bin"
.zshrc
7:export PATH="$PATH:/Users/simon/.local/bin"

Tags: julia-evans, zsh, bash, ripgrep

Thursday, 13. February 2025

Simon Willison

shot-scraper 1.6 with support for HTTP Archives

shot-scraper 1.6 with support for HTTP Archives New release of my shot-scraper CLI tool for taking screenshots and scraping web pages. The big new feature is HTTP Archive (HAR) support. The new shot-scraper har command can now create an archive of a page and all of its dependents like this: shot-scraper har https://datasette.io/ This produces a datasette-io.har file (currently 163KB) which

shot-scraper 1.6 with support for HTTP Archives

New release of my shot-scraper CLI tool for taking screenshots and scraping web pages.

The big new feature is HTTP Archive (HAR) support. The new shot-scraper har command can now create an archive of a page and all of its dependents like this:

shot-scraper har https://datasette.io/

This produces a datasette-io.har file (currently 163KB) which is JSON representing the full set of requests used to render that page. Here's a copy of that file. You can visualize that here using ericduran.github.io/chromeHAR.

That JSON includes full copies of all of the responses, base64 encoded if they are binary files such as images.

You can add the --zip flag to instead get a datasette-io.har.zip file, containing JSON data in har.har but with the response bodies saved as separate files in that archive.

The shot-scraper multi command lets you run shot-scraper against multiple URLs in sequence, specified using a YAML file. That command now takes a --har option (or --har-zip or --har-file name-of-file), described in the documentation, which will produce a HAR at the same time as taking the screenshots.

Shots are usually defined in YAML that looks like this:

- output: example.com.png
  url: http://www.example.com/
- output: w3c.org.png
  url: https://www.w3.org/

You can now omit the output: keys and generate a HAR file without taking any screenshots at all:

- url: http://www.example.com/
- url: https://www.w3.org/

Run like this:

shot-scraper multi shots.yml --har

Which outputs:

Skipping screenshot of 'https://www.example.com/'
Skipping screenshot of 'https://www.w3.org/'
Wrote to HAR file: trace.har

shot-scraper is built on top of Playwright, and the new features use the browser.new_context(record_har_path=...) parameter.
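
If you want to see the underlying Playwright mechanism on its own, a minimal sketch looks like this (Python Playwright, Chromium assumed):

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    # record_har_path tells Playwright to capture every request/response for the context
    context = browser.new_context(record_har_path="trace.har")
    page = context.new_page()
    page.goto("https://datasette.io/")
    context.close()  # the HAR file is written when the context closes
    browser.close()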

Tags: projects, shot-scraper, playwright, python, scraping


John Philpin : Lifestream

If someone wants to unlock a stolen phone – or retrieve mo

If someone wants to unlock a stolen phone – or retrieve money from an ATM – and only have five guesses, this data suggests they still have a one-in-eight chance of guessing correctly. 💬 Nathan Yau - Flowing Data 🔗 The Complete Article

If someone wants to unlock a stolen phone – or retrieve money from an ATM – and only have five guesses, this data suggests they still have a one-in-eight chance of guessing correctly.

💬 Nathan Yau - Flowing Data

🔗 The Complete Article


The Pragmatic Engineer

The Pulse #124: Software engineering job openings at five-year low?

Data from Indeed shows software developer job openings have seen the biggest decline in five years across the US. What’s the cause, and what’s next? Also: OpenAI responds to DeepSeek, and more.

The Pulse is a series covering insights, patterns, and trends within Big Tech and startups. Notice an interesting event or trend? Send me a message.

Today, we cover:

Industry pulse. OpenAI struggles to shed nonprofit status, Anthropic bans AI from job applications, cuts at Workday, GitHub Copilot releases an agentic mode, and Linus Torvalds tells Linux contributors that social media pressure is bad for development.

Software engineering job openings hit five-year low? There are 35% fewer software developer job listings on Indeed today than five years ago. Compared to other industries, job listings for software engineers grew much more in 2021-2022, but have declined much faster since. A look into possible reasons for this, and what could come next.

OpenAI responds to DeepSeek threat. Fresh off the shock of a relatively unknown Chinese company dethroning ChatGPT as the most-downloaded AI app on iPhone and Android, OpenAI has shaken itself up, promptly copied DeepSeek’s visualization of how its reasoning model “thinks”, and has shipped Deep Research.

A glimpse from the section “Software engineering job openings hit five-year low?”

OpenAI struggles to shed nonprofit status

Yet more drama at OpenAI, where cofounder Elon Musk has made a $97 billion dollar bid to take over the OpenAI nonprofit. A few details to unpack:

Yes, Musk really is one of the several cofounders of OpenAI. In 2015 it was he, Sam Altman, and 9 others (Greg Brockman, Ilya Sutskever, John Schulman, Wojciech Zaremba, Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, and Pamela Vagata).

Musk contributed around $45M to OpenAI as a donor, as confirmed by OpenAI.

In 2017, things got messy and Musk eventually left OpenAI. This was the year when OpenAI decided to create a for-profit entity, controlled by the nonprofit. As shared by OpenAI, Musk wanted control over the for-profit part, and then later wanted to merge it into Tesla.

Musk created rival X.AI in 2023, and this company has raised $6B of funding.

OpenAI is worth a lot more than $97B; its latest valuation was at $157B in October, and there are rumors of a new raise at or above $300B!

Here’s where OpenAI’s tricky corporate structure comes into play. OpenAI published a diagram showcasing the structure, but leaving out ownership stakes. Here’s my interpretation, focusing on ownership stakes and control:

OpenAI’s ownership structure. It’s messy.

Musk is offering to buy the nonprofit entity, not the for-profit one. Musk’s offer at $97B for a 51% controlling share values OpenAI at around $195B, which is higher than OpenAI’s last valuation of $157B. The company’s board has a fiduciary duty to consider the offer. However, OpenAI is officially a non-profit, so this duty may not apply.

It’s likely that this episode’s main impact will be to highlight the ridiculous, convoluted corporate structure of the world’s most valuable private company. I wonder if investors may opt against investing more until this structure is tidied up, with the for-profit entity controlled by investors rather than the nonprofit, and with investors receiving an ownership stake as well.

Anthropic bans AI from job applications

Read more


John Philpin : Lifestream

I was thinking of posting a quick response analysis of 🖇️ th

I was thinking of posting a quick response analysis of 🖇️ this post from yesterday - but not yet.


Interesting to see my BlueSky cross post reach the micro.blo

Interesting to see my BlueSky cross post reach the micro.blog timeline before the native Micro Blog post.


🚧🕵💬 Inspired By People | 🌐 Mahatma Gandhi The Full Coll

🚧🕵💬 Inspired By People | 🌐 Mahatma Gandhi

The Full Collection : Visual Scroll | 🔍 List Of Posts


Ben Werdmüller

A streak, at last

It’s a small thing, but I broke my Apple Fitness longest streak this week. That means my consecutive number of days that I’ve hit my fitness goals on my Apple Watch has been longer than it’s ever been.

Here’s why that’s meaningful: my previous longest streak was broken when my mother died, almost four years ago, after a ten-year terminal decline. For part of that time, I thought that my sister and I might have the illness too, and I’ve watched four other members of my family follow the same journey. It’s been a hard decade or two, and I haven’t been together enough to manage any kind of streak since that awful week in the hospital. Until now.

I’m not saying that nature is healing — grief is with me every single day — but it feels like, in the midst of genuine crises at home, in my family, and in the world, at least something is going right.


Marshmallow Test and Parenting

[Sergey Bogdanov]

This is a story about parenting, but also about the importance of considering context inclusively. We've all heard about the Stanford marshmallow experiment, but:

"Kids from wealthier families waited longer than kids from low-income families. Not because they had more self-control, but because their environment made waiting feel safer. If you grow up knowing there will always be food on the table, waiting for an extra marshmallow isn’t a big deal. But if your life is more uncertain, grabbing what you can when you can make total sense. It’s a survival skill, not a lack of discipline."

It's not just about socioeconomic background; it's also, as the article points out, about building a culture of trust. That's important in families, but also in companies, and everywhere.

"So what’s the takeaway here? It’s simple, really: as parents, we set the tone. Our actions, promises, and reliability shape how our kids see the world. Are we building an environment where they feel safe enough to wait? Or are we teaching them that they need to grab what they can, when they can?"

What can we change to create emotional safety? How can we let them know they're protected? It really matters, in all walks of life.

#Parenting

[Link]


What Elon Musk's son in Trump's Oval Office tells us about politics and family

[Jennifer Gerson at The 19th]

Musk has been seen with his child, X Æ A-Xii (or "Lil X" for short), at press conferences. But, as Jennifer Gerson notes here, it's not just a photo op.

"Musk, a father of 12, is an avowed pronatalist, or someone who believes declining population rates are a major concern and has committed to work to remedy this by having as many children as possible. He sees part of his life’s work as repopulating the planet with as many children — and exceptional children at that — as possible."

It's also a wild double standard. Imagine the chorus of conservative mocking if a woman had done the same thing. That's not to say that children shouldn't be normalized at work - they really should - but what's happening in the current moment is hardly a symbol of radical inclusion.

"Pronatalism requires that people who are able to carry pregnancies — mostly women — be pregnant for large periods of time. These pregnancies can have a major impact on women’s participation in the workforce and economic mobility."

Finally, as Gerson points out, it's certainly worth mentioning that pro-natalists aren't just in favor of any kind of baby. They're heavy eugenicists, with a focus on certain characteristics, including retrograde ideas like potential IQ. Might this also include, in Musk's case, the 14 words? I couldn't possibly say.

#Democracy

[Link]


John Philpin : Lifestream

A hotel advertises ‘WiFi’ and then delivers this service

A hotel advertises ‘WiFi’ and then delivers this service

What should my response be?


@_Nat Zone

February 27: [Special Conversation] Nem ✕ Natsuhiko Sakimura – Identity in the Age of the AI Metaverse [Digital Space Conference 2025]

On February 27, at a side event of Digital Space Conference 2025, I will hold a special conversation with Virtual Bishojo Nem, the VTuber who last year tied for 4th place with Juufuutei Raden in the “Top 10 VTubers Who Shone Brightest This Year” [MoguLive VTuber Award 2024]. The title: “Nem ✕ Natsuhiko Sakimura – Identity in the Age of the AI Metaverse”.

Where is the ID/identity of a new kind of humanity headed on an internet being transformed by AI and the metaverse!? Don’t miss it.

You can watch at the Digital Space Conference 2025 venue, and it will also be live-streamed on YouTube Live. Please join us.

Speakers: ■ Virtual Bishojo Nem (VTuber / author)

Active as a “metaverse native” who has lived in the virtual world since its early days and as a “metaverse culture evangelist” who communicates its culture. Under the theme “become who you want to be in virtual space”, she has performed as a bishojo idol since 2017 and calls herself the world’s oldest individual VTuber. Despite openly using a voice changer, she debuted as a singer with the original song “Kokoro Cosplay”. Her book Metaverse Evolution Theory (2022, Gijutsu-Hyoronsha), which discusses the revolutionary nature of the metaverse, won the grand prize in the business book category of the IT Engineer Book Award 2023. She also spoke at the UN’s IGF Kyoto 2023 conference. In recognition of her contributions to avatar culture, she received the Avatar Award 2022 Special Achievement Award from the VRM Consortium, only the second recipient in its history after Kizuna AI.
X : https://x.com/nemchan_nel

■ Natsuhiko Sakimura (Representative, NAT Consulting LLC / Chairman, OpenID Foundation)

Specializes in international standardization of digital identity and privacy. In addition to a series of related international standards used by more than three billion people worldwide, he is the author of Digital Identity (2021, Nikkei BP). He has chaired the OpenID Foundation (US) since 2011 and MyData Japan since 2019, and has served as Digital Special Advisor to the Japan Fair Trade Commission since 2021. He chairs the ISO/IEC JTC 1/SC 27 technical committee (information security, cybersecurity and privacy protection: identity management and privacy technologies) and serves on the OECD Internet Technical Advisory Committee. He has also participated in numerous government advisory bodies, including the Ministry of Internal Affairs and Communications’ “Study Group on Platforms” and the Digital Agency’s expert panel on revising the identity verification guidelines.
X : https://x.com/_nat

■ About Digital Space Conference 2025

Speakers working on the front lines of AI, Web3, and the metaverse candidly share the stories behind their successes and the lessons learned from failure. Attendees take away insights and new discoveries from concrete case studies and apply them to their own businesses, creating the next generation together. The event aims to be not merely a place to acquire knowledge, but a place to take a step toward the future.
Official site: https://jdsef.or.jp/event/dsc2025
Press release: https://prtimes.jp/main/html/rd/p/000000020.000125406.html

■ Japan Digital Space Economy Federation

We aim to energize economic activity in digital spaces and to contribute to the sound development of the Japanese economy and the realization of prosperous lives for the Japanese people.
X : https://x.com/Dejikeiren


Simon Willison

python-build-standalone now has Python 3.14.0a5

python-build-standalone now has Python 3.14.0a5

Exciting news from Charlie Marsh:

We just shipped the latest Python 3.14 alpha (3.14.0a5) to uv and python-build-standalone. This is the first release that includes the tail-calling interpreter.

Our initial benchmarks show a ~20-30% performance improvement across CPython.

This is an optimization that was first discussed in faster-cpython in January 2024, then landed earlier this month by Ken Jin and included in the 3.14.0a5 release. The alpha release notes say:

A new type of interpreter based on tail calls has been added to CPython. For certain newer compilers, this interpreter provides significantly better performance. Preliminary numbers on our machines suggest anywhere from -3% to 30% faster Python code, and a geometric mean of 9-15% faster on pyperformance depending on platform and architecture. The baseline is Python 3.14 built with Clang 19 without this new interpreter.

This interpreter currently only works with Clang 19 and newer on x86-64 and AArch64 architectures. However, we expect that a future release of GCC will support this as well.

Including this in python-build-standalone means it's now trivial to try out via uv. I upgraded to the latest uv like this:

pip install -U uv

Then ran uv python list to see the available versions:

cpython-3.14.0a5+freethreaded-macos-aarch64-none    <download available>
cpython-3.14.0a5-macos-aarch64-none                 <download available>
cpython-3.13.2+freethreaded-macos-aarch64-none      <download available>
cpython-3.13.2-macos-aarch64-none                   <download available>
cpython-3.13.1-macos-aarch64-none                   /opt/homebrew/opt/python@3.13/bin/python3.13 -> ../Frameworks/Python.framework/Versions/3.13/bin/python3.13

I downloaded the new alpha like this:

uv python install cpython-3.14.0a5

And tried it out like so:

uv run --python 3.14.0a5 python

The Astral team have been using Ken's bm_pystones.py benchmarks script. I grabbed a copy like this:

wget 'https://gist.githubusercontent.com/Fidget-Spinner/e7bf204bf605680b0fc1540fe3777acf/raw/fa85c0f3464021a683245f075505860db5e8ba6b/bm_pystones.py'

And ran it with uv:

uv run --python 3.14.0a5 bm_pystones.py

Giving:

Pystone(1.1) time for 50000 passes = 0.0511138
This machine benchmarks at 978209 pystones/second

Inspired by Charlie's example I decided to try the hyperfine benchmarking tool, which can run multiple commands to statistically compare their performance. I came up with this recipe:

brew install hyperfine

hyperfine \
  "uv run --python 3.14.0a5 bm_pystones.py" \
  "uv run --python 3.13 bm_pystones.py" \
  -n tail-calling \
  -n baseline \
  --warmup 10

So 3.14.0a5 scored 1.12 times faster than 3.13 on the benchmark (on my extremely overloaded M2 MacBook Pro).

Tags: uv, astral, benchmarks, python


URL-addressable Pyodide Python environments

This evening I spotted an obscure bug in Datasette, using Datasette Lite. I figure it's a good opportunity to highlight how useful it is to have a URL-addressable Python environment, powered by Pyodide and WebAssembly.

Here's the page that helped me discover the bug:

https://lite.datasette.io/?install=datasette-visible-internal-db&ref=1.0a17#/_internal/catalog_columns?_facet=database_name

To explain what's going on here, let's first review the individual components.

Datasette Lite
The Datasette 1.0 alphas
This works for plugins, too
datasette-visible-internal-db
Spotting the bug
Fixing the bug
URL-addressable Steps To Reproduce

Datasette Lite

Datasette Lite is a version of Datasette that runs entirely in your browser. It runs on Pyodide, which I think is still the most underappreciated project in the Python ecosystem.

I built Datasette Lite almost three years ago as a weekend hack project to try and see if I could get Datasette - a server-side Python web application - to run entirely in the browser.

I've added a bunch of features since then, described in the README - most significantly the ability to load SQLite databases, CSV files, JSON files or Parquet files by passing a URL to a query string parameter.

I built Datasette Lite almost as a joke, thinking nobody would want to wait for a full Python interpreter to download to their browser each time they wanted to explore some data. It turns out internet connections are fast these days and having a version of Datasette that needs a browser, GitHub Pages and nothing else is actually extremely useful.

Just the other day I saw Logan Williams of Bellingcat using it to share a better version of this Excel sheet:

The NSF grants that Ted Cruz has singled out for advancing "neo-Marxist class warfare propaganda," in Datasette-Lite: lite.datasette.io?url=https://...

Let's look at that URL in full:

https://lite.datasette.io/?url=https://data-house-lake.nyc3.cdn.digitaloceanspaces.com/cruz_nhs.db#/cruz_nhs/grants

The ?url= parameter there points to a SQLite database file, hosted on DigitalOcean Spaces and served with the all-important access-control-allow-origin: * header which allows Datasette Lite to load it across domains.

The #/cruz_nhs/grants part of the URL tells Datasette Lite which page to load when you visit the link.

Anything after the # in Datasette Lite is a URL that gets passed on to the WebAssembly-hosted Datasette instance. Any query string items before that can be used to affect the initial state of the Datasette instance, to import data or even to install additional plugins.

The Datasette 1.0 alphas

I've shipped a lot of Datasette alphas - the most recent is Datasette 1.0a17. Those alphas get published to PyPI, which means they can be installed using pip install datasette==1.0a17.

A while back I added the same ability to Datasette Lite itself. You can now pass &ref=1.0a17 to the Datasette Lite URL to load that specific version of Datasette.

This works thanks to the magic of Pyodide's micropip mechanism. Every time you load Datasette Lite in your browser it's actually using micropip to install the packages it needs directly from PyPI. The code looks something like this:

await pyodide.loadPackage('micropip', {messageCallback: log});
let datasetteToInstall = 'datasette';
let pre = 'False';
if (settings.ref) {
  if (settings.ref == 'pre') {
    pre = 'True';
  } else {
    datasetteToInstall = `datasette==${settings.ref}`;
  }
}
await self.pyodide.runPythonAsync(`
import micropip
await micropip.install("${datasetteToInstall}", pre=${pre})
`);

Full code here.

That settings object has been passed to the Web Worker that loads Datasette, incorporating various query string parameters.

This all means I can pass ?ref=1.0a17 to Datasette Lite to load a specific version, or ?ref=pre to get the most recently released pre-release version.

This works for plugins, too

Since loading extra packages from PyPI via micropip is so easy, I went a step further and added plugin support.

The ?install= parameter can be passed multiple times, each time specifying a Datasette plugin from PyPI that should be installed into the browser.

The README includes a bunch of examples of this mechanism in action. Here's a fun one that loads datasette-mp3-audio to provide inline MP3 playing widgets, originally created for my ScotRail audio announcements project.

This only works for some plugins. They need to be pure Python wheels - getting plugins with compiled binary dependencies to work in Pyodide WebAssembly requires a whole set of steps that I haven't quite figured out.

Frustratingly, it doesn't work for plugins that run their own JavaScript yet! I may need to rearchitect significant chunks of both Datasette and Datasette Lite to make that work.

It's also worth noting that this is a remote code execution security hole. I don't think that's a problem here, because lite.datasette.io is deliberately hosted on the subdomain of a domain that I never intend to use cookies on. It's possible to vandalize the visual display of lite.datasette.io but it shouldn't be possible to steal any private data or do any lasting damage.

datasette-visible-internal-db

This evening's debugging exercise used a plugin called datasette-visible-internal-db.

Datasette's internal database is an invisible SQLite database that sits at the heart of Datasette, tracking things like loaded metadata and the schemas of the currently attached tables.

Being invisible means we can use it for features that shouldn't be visible to users - plugins that record API secrets or permissions or track comments or data import progress, for example.

In Python code it's accessed like this:

internal_db = datasette.get_internal_database()

As opposed to Datasette's other databases which are accessed like so:

db = datasette.get_database("my-database")

Sometimes, when hacking on Datasette, it's useful to be able to browse the internal database using the default Datasette UI.

That's what datasette-visible-internal-db does. The plugin implementation is just five lines of code:

import datasette

@datasette.hookimpl
def startup(datasette):
    db = datasette.get_internal_database()
    datasette.add_database(db, name="_internal", route="_internal")

On startup the plugin grabs a reference to that internal database and then registers it using Datasette's add_database() method. That's all it takes to have it show up as a visible database on the /_internal path within Datasette.

Spotting the bug

I was poking around with this today out of pure curiosity - I hadn't tried ?install=datasette-visible-internal-db with Datasette Lite before and I wanted to see if it worked.

Here's that URL from earlier, this time with commentary:

https://lite.datasette.io/                  // Datasette Lite
?install=datasette-visible-internal-db      // Install the visible internal DB plugin
&ref=1.0a17                                 // Load the 1.0a17 alpha release
#/_internal/catalog_columns                 // Navigate to the /_internal/catalog_columns table page
&_facet=database_name                       // Facet by database_name for good measure

And this is what I saw:

This all looked good... until I clicked on that _internal link in the database_name column... and it took me to this /_internal/databases/_internal 404 page.

Why was that a 404? Datasette introspects the SQLite table schema to identify foreign key relationships, then turns those into hyperlinks. The SQL schema for that catalog_columns table (displayed at the bottom of the table page) looked like this:

CREATE TABLE catalog_columns (
    database_name TEXT,
    table_name TEXT,
    cid INTEGER,
    name TEXT,
    type TEXT,
    "notnull" INTEGER,
    default_value TEXT, -- renamed from dflt_value
    is_pk INTEGER, -- renamed from pk
    hidden INTEGER,
    PRIMARY KEY (database_name, table_name, name),
    FOREIGN KEY (database_name) REFERENCES databases(database_name),
    FOREIGN KEY (database_name, table_name) REFERENCES tables(database_name, table_name)
);

Those foreign key references are a bug! I renamed the internal tables from databases and tables to catalog_databases and catalog_tables quite a while ago, but apparently forgot to update the references - and SQLite let me get away with it.

Fixing the bug

I fixed the bug in this commit. As is often the case the most interesting part of the fix is the accompanying test. I decided to use the introspection helpers in sqlite-utils to guard against ever making another mistake like this in the future:

@pytest.mark.asyncio
async def test_internal_foreign_key_references(ds_client):
    internal_db = await ensure_internal(ds_client)

    def inner(conn):
        db = sqlite_utils.Database(conn)
        table_names = db.table_names()
        for table in db.tables:
            for fk in table.foreign_keys:
                other_table = fk.other_table
                other_column = fk.other_column
                message = 'Column "{}.{}" references other column "{}.{}" which does not exist'.format(
                    table.name, fk.column, other_table, other_column
                )
                assert other_table in table_names, message + " (bad table)"
                assert other_column in db[other_table].columns_dict, (
                    message + " (bad column)"
                )

    await internal_db.execute_fn(inner)

This uses Datasette's await db.execute_fn() method, which lets you run Python code that accesses SQLite in a thread. That code can then use the blocking sqlite-utils introspection methods - here I'm looping through every table in that internal database, looping through each table's .foreign_keys and confirming that the .other_table and .other_column values reference a table and column that genuinely exist.

I ran this test, watched it fail, then applied the fix and it passed.

URL-addressable Steps To Reproduce

The idea I most wanted to highlight here is the enormous value provided by URL-addressable Steps To Reproduce.

Having good Steps To Reproduce is crucial for productively fixing bugs. Something you can click on to see the bug is the most effective form of STR there is.

Ideally, these URLs will continue to work long into the future.

The great thing about a system like Datasette Lite is that everything is statically hosted files. The application itself is hosted on GitHub Pages, and it works by loading additional files from various different CDNs. The only dynamic aspect is cached lookups against the PyPI API, which I expect to stay stable for a long time to come.

As a stable component of the Web platform for almost 8 years, WebAssembly is clearly here to stay. I expect we'll be able to execute today's WASM code in browsers 20+ years from now.

I'm confident that the patterns I've been exploring in Datasette Lite over the past few years could be just as valuable for other projects. Imagine demonstrating bugs in a Django application using a static WebAssembly build, archived forever as part of an issue tracking system.

I think WebAssembly and Pyodide still have a great deal of untapped potential for the wider Python world.

Tags: python, urls, datasette, webassembly, pyodide, datasette-lite

Wednesday, 12. February 2025

Simon Willison

Quoting Sam Altman

We want AI to “just work” for you; we realize how complicated our model and product offerings have gotten.

We hate the model picker as much as you do and want to return to magic unified intelligence.

We will next ship GPT-4.5, the model we called Orion internally, as our last non-chain-of-thought model.

After that, a top goal for us is to unify o-series models and GPT-series models by creating systems that can use all our tools, know when to think for a long time or not, and generally be useful for a very wide range of tasks.

In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3. We will no longer ship o3 as a standalone model.

[When asked about release dates for GPT 4.5 / GPT 5:] weeks / months

Sam Altman

Tags: generative-ai, openai, o3, chatgpt, ai, llms, sam-altman


Nomic Embed Text V2: An Open Source, Multilingual, Mixture-of-Experts Embedding Model

Nomic Embed Text V2: An Open Source, Multilingual, Mixture-of-Experts Embedding Model

Nomic continue to release the most interesting and powerful embedding models. Their latest is Embed Text V2, an Apache 2.0 licensed multi-lingual 1.9GB model (here it is on Hugging Face) trained on "1.6 billion high-quality data pairs", which is the first embedding model I've seen to use a Mixture of Experts architecture:

In our experiments, we found that alternating MoE layers with 8 experts and top-2 routing provides the optimal balance between performance and efficiency. This results in 475M total parameters in the model, but only 305M active during training and inference.

I first tried it out using uv run like this:

uv run \
  --with einops \
  --with sentence-transformers \
  --python 3.13 python

Then:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("nomic-ai/nomic-embed-text-v2-moe", trust_remote_code=True)
sentences = ["Hello!", "¡Hola!"]
embeddings = model.encode(sentences, prompt_name="passage")
print(embeddings)

Then I got it working on my laptop using the llm-sentence-transformers plugin like this:

llm install llm-sentence-transformers
llm install einops  # additional necessary package
llm sentence-transformers register nomic-ai/nomic-embed-text-v2-moe --trust-remote-code
llm embed -m sentence-transformers/nomic-ai/nomic-embed-text-v2-moe -c 'string to embed'

This outputs a 768 item JSON array of floating point numbers to the terminal. These are Matryoshka embeddings which means you can truncate that down to just the first 256 items and get similarity calculations that still work albeit slightly less well.
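
As a rough illustration of that truncation trick, here's a hypothetical sketch that reuses the embeddings array from the sentence-transformers example above; the re-normalization step is an assumption about how you'd prepare the shortened vectors for cosine similarity:

import numpy as np

# Truncate two Matryoshka embeddings to their first 256 dimensions and compare them
a = np.array(embeddings[0])[:256]
b = np.array(embeddings[1])[:256]
a = a / np.linalg.norm(a)
b = b / np.linalg.norm(b)
print(float(np.dot(a, b)))  # cosine similarity on the truncated vectors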

To use this for RAG you'll need to conform to Nomic's custom prompt format. For documents to be searched:

search_document: text of document goes here

And for search queries:

search_query: term to search for

I landed a new --prepend option for the llm embed-multi command to help with that, but it's not out in a full release just yet.

I also released llm-sentence-transformers 0.3 with some minor improvements to make running this model more smooth.

Via @nomic_ai

Tags: embeddings, llm, nomic, ai, rag, uv, python


Building a SNAP LLM eval: part 1

Building a SNAP LLM eval: part 1

Dave Guarino (previously) has been exploring using LLM-driven systems to help people apply for SNAP, the US Supplemental Nutrition Assistance Program (aka food stamps).

This is a domain which existing models know some things about, but which is full of critical details around things like eligibility criteria where accuracy really matters.

Domain-specific evals like this are still pretty rare. As Dave puts it:

There is also not a lot of public, easily digestible writing out there on building evals in specific domains. So one of our hopes in sharing this is that it helps others build evals for domains they know deeply.

Having robust evals addresses multiple challenges. The first is establishing how good the raw models are for a particular domain. A more important one is to help in developing additional systems on top of these models, where an eval is crucial for understanding if RAG or prompt engineering tricks are paying off.

Step 1 doesn't involve writing any code at all:

Meaningful, real problem spaces inevitably have a lot of nuance. So in working on our SNAP eval, the first step has just been using lots of models — a lot. [...]

Just using the models and taking notes on the nuanced “good”, “meh”, “bad!” is a much faster way to get to a useful starting eval set than writing or automating evals in code.

I've been complaining for a while that there isn't nearly enough guidance about evals out there. This piece is an excellent step towards filling that gap.

Tags: evals, llms, ai, generative-ai


The Pragmatic Engineer

Live streaming at world-record scale with Ashutosh Agrawal

In May 2023, a live streaming world record was set with 32 million concurrent viewers watching the final of the Indian Premier League cricket tournament. A chat with the architect behind this system.
Stream the Latest Episode

Available now on YouTube, Apple and Spotify. See the episode transcript at the top of this page, and a summary at the bottom.

Brought to You By

WorkOS — The modern identity platform for B2B SaaS

CodeRabbit — Cut code review time and bugs in half. Use the code PRAGMATIC to get one month free.

Augment Code — AI coding assistant that pro engineering teams love

In This Episode

How do you architect a live streaming system to deal with more load than any similar system has dealt with before? Today, we hear from an architect of such a system: Ashutosh Agrawal, formerly Chief Architect of JioCinema (and currently Staff Software Engineer at Google DeepMind). In May 2023, JioCinema set the live-streaming world record, serving 32 million concurrent viewers tuning in to the final of the Indian Premier League (a cricket tournament).

We take a deep dive into video streaming architecture, tackling the complexities of live streaming at scale (at tens of millions of parallel streams) and the challenges engineers face in delivering seamless experiences. We talk about the following topics:

• How large-scale live streaming architectures are designed

• Tradeoffs in optimizing performance

• Early warning signs of streaming failures and how to detect them

• Why capacity planning for streaming is SO difficult

• The technical hurdles of streaming in APAC regions

• Why Ashutosh hates APMs (Application Performance Management systems)

• Ashutosh’s advice for those looking to improve their systems design expertise

• And much more!

Takeaways

My biggest takeaways from this episode:

1. The architecture behind live streaming systems is surprisingly logical. In the episode, Ashutosh explains how the live streaming system works, starting from the physical cameras on-site, through the production control room (PCR), streams being sliced-and-diced, and the HLS protocol (HTTP Live Streaming) used.

2. There are a LOT of tradeoffs you can play with when live streaming! The tradeoffs between server load, latency, and server resources vs client caching are hard decisions to make. Want to reduce the server load? Serve longer chunks to clients, resulting in fewer requests per minute, per client… at the expense of clients potentially lagging further behind. This is just one of many possible decisions to make.

3. At massive video streaming scale, capacity planning can start a year ahead! It was surprising to hear how Ashutosh had to convince telecoms and data centers to invest more in their server infrastructure so they could handle the load come peak viewership months later. This kind of challenge will be nonexistent for most of us engineers. Still, it's interesting to consider that when you are serving a scale that's not been done before, you need to worry about the underlying infra!

4. “Game day” is such a neat load testing concept. The team at Jio would simulate “game day” load months before the event. They did tell teams when the load test would start, but did not share anything else! Preparing for a “game day” test is a lot of work, but it can pay off by finding the parts of the system that buckle under extreme load.

The Pragmatic Engineer deepdives relevant for this episode

Software architect archetypes

Engineering leadership skill set overlaps

Software architecture with Grady Booch

Timestamps

(00:00) Intro

(01:28) The world record-breaking live stream and how support works with live events

(05:57) An overview of streaming architecture

(21:48) The differences between internet streaming and traditional television

(22:26) How adaptive bitrate streaming works

(25:30) How throttling works on the mobile tower side

(27:46) Leading indicators of streaming problems and the data visualization needed

(31:03) How metrics are set

(33:38) Best practices for capacity planning

(35:50) Which resources are planned for in capacity planning

(37:10) How streaming services plan for future live events with vendors

(41:01) APAC specific challenges

(44:48) Horizontal scaling vs. vertical scaling

(46:10) Why auto-scaling doesn’t work

(47:30) Concurrency: the golden metric to scale against

(48:17) User journeys that cause problems

(49:59) Recommendations for learning more about video streaming

(51:11) How Ashutosh learned on the job

(55:21) Advice for engineers who would like to get better at systems

(1:00:10) Rapid fire round

A summary of the conversation The Live Streaming Pipeline

The journey of a live stream starts with the cameras at the event’s venue. These cameras are connected by fiber to a Production Control Room (PCR).

In the PCR, a director selects which camera feeds to use, much like in a movie production.

Source feed (or production feed) is then sent to a contribution encoder. This encoder compresses the high-bandwidth source feed to a more manageable size.

The compressed feed is then transmitted to the cloud using a private peer-to-peer link.

Distribution encoder: prepares the stream in various formats for end-user consumption, such as HLS and DASH.

Over 100 stream variants can be generated for various devices – and up to 500 (!) when different languages are included.

Orchestrator: manages the pipeline, from contribution encoding to the cloud infrastructure. The orchestrator decides which endpoints to push to, the configuration of the distribution encoder, and the CDN endpoints.

Playback URLs: generated by the orchestrator. URLs are specific to the device and format being used.

When a user clicks play, a separate playback system takes over. This system verifies user authorization, deals with encryption, and handles Digital Rights Management (DRM). The playback system then provides the client app with an encrypted URL to stream the content.

Live streaming systems are more complex than Video on Demand (VOD) systems because of the need to manage multiple real-time streams and user authentication and authorization for those streams, all while keeping latency low.

Content Delivery

Content delivery relies on Content Delivery Networks (CDNs).

The core technology used is HLS or DASH, where the video is broken down into segments.

HLS uses a master manifest file (e.g., master.m3u8) that lists different video quality levels. Each quality level refers to a child manifest.

Child manifests list individual video segments. These segments are typically between four to six seconds long.

The client player requests a child manifest once per segment duration, along with the segments it lists (a sketch of this polling loop appears at the end of this section).

CDN: works at the segment level rather than at a millisecond level.

Correctly setting up CDN configurations, such as the Time To Live (TTL) values for the cached segments, is crucial to ensure a smooth stream without stale data.

Latency is introduced at various stages of the live-streaming process. This includes encoding, network transmission, and client-side buffering.

Encoding techniques that use a look-back period, or Group of Pictures (GOP), achieve more efficient compression. The GOP might be 1, 2, or 4 seconds long.

Client-side buffering is used to give a smoother streaming experience, even if there are small network issues. This means the user might be watching the stream a few seconds behind the real-time live point.

There are trade-offs between latency, smooth playback, and infrastructure demands. Reducing the segment duration increases calls to the CDN, impacting infrastructure needs.

Adaptive bitrate streaming is used to adjust the video quality in response to the user's network conditions.

The client-side player measures the download speed and chooses an appropriate video quality level, matching the user's network capacity.

If the network speed slows down, the client can switch to a lower-quality video (e.g., from 720p to 240p).

The server can also degrade the user's stream by limiting the number of available video quality options, for example during very high load. The server can also adjust the segment length in response to system load.

The client player is always starting playback a few seconds behind the live point, to avoid any interruption in playback if a segment is missed.

If a segment is missed on a TV, the TV will continue playing at the live point. However, on the internet, the client is using a buffer and will try to avoid missing a segment.
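
To make the manifest-polling mechanics above concrete, here is a small illustrative sketch in Python; the CDN URLs, the six-second segment duration, and the fixed loop count are assumptions for the example, not details from the episode:

import time
import urllib.request

BASE = "https://cdn.example.com/stream/720p/"  # hypothetical CDN path
CHILD_MANIFEST = BASE + "playlist.m3u8"
SEGMENT_DURATION = 6  # seconds, within the 4-6 second range mentioned above

seen = set()
for _ in range(10):  # a real player keeps polling for the whole event
    manifest = urllib.request.urlopen(CHILD_MANIFEST).read().decode()
    for line in manifest.splitlines():
        # in an HLS child manifest, lines that don't start with '#' are segment URIs
        if line and not line.startswith("#") and line not in seen:
            seen.add(line)
            segment = urllib.request.urlopen(BASE + line).read()
            # a real player would append `segment` to its playback buffer here
    time.sleep(SEGMENT_DURATION)  # re-request the child manifest every segment duration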

Monitoring, Metrics, and Scaling

Monitoring is based on leading and trailing indicators.

Leading indicators help to identify potential problems in realtime. Examples include buffer time and playback failure rates. These leading indicator metrics are given priority in the system.

Trailing indicators are used to perform a detailed analysis of issues after they occur.

Client-side metrics are collected and quickly processed by the server in less than a minute or sometimes within 30 seconds.

Server-side metrics, such as bandwidth, the number of requests, and latency, are also tracked.

The frequency of data collection is adjusted based on the system load. When there is higher traffic, the amount of data collected is sampled to manage the volume of data collected and processed.

Capacity planning is a very complex process involving infrastructure, network, and power, and it starts at the end of the prior year for the following year.

Capacity planning involves coordination with several infra providers to make sure they can scale their infrastructure for the events.

The planning focuses on metrics such as compute, RAM, disk, and network usage. The main metric that becomes the limiting factor is vCPUs.

Cloud resources are not infinite at the scale required for major live events. There is a finite amount of resources in a given location – at this scale of streaming, that is!

Providers need to purchase real estate, install links, and deploy servers.

Horizontal scaling is preferred for compute resources as it is easy to add boxes to the pool.

Databases and caches are scaled preemptively to avoid the need to scale them on the fly during events.

Auto-scaling is not effective for live events because it is too slow to respond to the rapid changes in traffic. Custom scaling systems are preferred.

The custom scaling system uses a concurrency metric (the number of users watching the stream) to scale services. All systems are scaled against this common concurrency metric; see the sketch after this section.

The custom scaler also looks at user journeys, such as when users press the back button and return to the home page. This can cause a spike in traffic to the home page API.
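
Here is a small illustrative sketch of what scaling every service off a single concurrency number can look like; the service names, per-instance capacities, and headroom factor are invented for the example:

import math

SERVICE_CAPACITY = {  # assumed concurrent viewers a single instance can serve
    "playback-api": 50_000,
    "home-page-api": 200_000,
    "auth": 150_000,
}

def desired_instances(concurrency: int, headroom: float = 1.3) -> dict:
    """Derive instance counts for every service from the same concurrency metric."""
    return {
        service: max(1, math.ceil(concurrency * headroom / capacity))
        for service, capacity in SERVICE_CAPACITY.items()
    }

print(desired_instances(32_000_000))
# {'playback-api': 832, 'home-page-api': 208, 'auth': 278}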

APAC-specific live streaming challenges

Mobility is a significant challenge because most users in India watch live streams on mobile devices and are often on the move. This means that users are constantly switching between cell towers.

Battery consumption is also a key factor. Video streaming can quickly drain mobile phone batteries.

The video profile, polling frequency, and encoding algorithms are often chosen to reduce battery consumption.

“Game day simulation”: something Jio did to simulate peak load conditions.

It involved generating synthetic traffic; teams had to scale systems and follow operational protocols in response.

The teams did not have access to the traffic dashboard, so the traffic patterns were unknown to them.

Advice for engineers to become better architects

Understand this: anything that can fail, will fail. Overconfidence in systems can lead to problems! Most people underestimate or overlook the potential failure modes.

Look at every aspect of your system including configurations and code as even the smallest things can cause problems.

Detailed metrics and measurements are vital. Both to see potential problems and to be able to debug effectively.

Ensure you are measuring metrics correctly. For example, response time should be measured from when the request is queued, not when it enters the processing function.

Do not rely too heavily on APMs. It is better to understand the low-level details and measure and fine-tune every aspect of your code.

To learn more about video encoding: look up documentation on GitHub and online. Look for resources going into how image compression is done, and how images are turned into video.

Most of the learning happens on the job. There isn't a lot of public information about problems at this kind of scale! Hopefully, this podcast was helpful in sharing more details!

Resources & Mentions

Where to find Ashutosh Agrawal:

• X: https://x.com/theprogrammerin

• LinkedIn: https://www.linkedin.com/in/theprogrammerin/

• Medium: https://medium.com/@theprogrammerin

Mentions during the episode:

• Disney+ Hotstar: https://www.hotstar.com/in

• What is a CDN: https://aws.amazon.com/what-is/cdn/

• Adaptive bitrate streaming: https://en.wikipedia.org/wiki/Adaptive_bitrate_streaming

• Skype: https://www.skype.com/en/

• Millions Scale Simulations: https://blog.hotstar.com/millons-scale-simulations-1602befe1ce5

• Black Friday: https://en.wikipedia.org/wiki/Black_Friday_(shopping)

• Asia-Pacific (APAC): https://en.wikipedia.org/wiki/Asia%E2%80%93Pacific

• Distributed architecture concepts I learned while building a large payments system: https://blog.pragmaticengineer.com/distributed-architecture-concepts-i-have-learned-while-building-payments-systems/

• Concurrency: https://web.mit.edu/6.005/www/fa14/classes/17-concurrency/

• Video streaming resources on Github: https://github.com/leandromoreira/digital_video_introduction

• Murphy’s Law: https://en.wikipedia.org/wiki/Murphy%27s_Law_(disambiguation)

• Java: https://www.java.com/

• Ruby: https://www.ruby-lang.org/en/

• Ruby on Rails: https://rubyonrails.org/

• Hacker News: https://news.ycombinator.com/

Production and marketing by Pen Name. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.


John Philpin : Lifestream

🕵💬 Inspired By People | 🌐 John Adams The Full Collectio

🕵💬 Inspired By People | 🌐 John Adams

The Full Collection : Visual Scroll | 🔍 List Of Posts


Doc Searls Weblog

It’s still adtech

This morning Wired published This Ad-Tech Company Is Powering Surveillance of US Military Personnel. It’s a good piece, which is typical of Wired lately. But what caught my eye was “Ad-Tech” in the headline. Some writers say “ad tech.” Others say “adtech” or “AdTech.”

I’m highly leveraged in “adtech,” since I’ve now posted 160 pieces (including this one) using that expression since 2008.

So I did a Google Trends search on the three uses, and saw that “adtech” still wins, by a hair, though “ad tech” is close, and was ahead for much of the ‘teens.

The more interesting thing is that interest in the topic itself dropped by 75% between 2004 and the mid-teens, and has kinda flattened since then.

But it’s still an issue.

At least I am relieved of the need to go back and change all those posts.


Don’t Buy This

Looks right, works wrong.

Every so often a product shows up that is so bad somebody needs to sound a warning.

So I’m sounding one for the Ion Retro Glow. For the last month or so, it’s been on display and selling at the Sam’s Club here in Bloomington, Indiana. That’s where I shot the photo above.

At first I thought it was just an interesting goof: an old-fashioned boom box with Bluetooth and a rechargeable battery. But then I remembered that I have about a hundred cassette tapes I want to go through, and no way to play them. I also thought the thing wouldn’t be a bad Bluetooth speaker system for my phone. The sound, while not great, wasn’t bad when I pressed the buttons on the display. I also still listen to some radio, so I thought that might be a nice bonus, if the radio was at least okay. And the price was right: $69.

So I bought one.

OMG. What utter crap.

First, all the buttons and knobs are crap plastic, and the Stop/Eject button wouldn’t open the cassette door, making that function useless. Right there I needed to take it back. But to exchange it?

Not after checking out the radio.

The dial knob was so stiff I thought something was surely wrong with it. And there was. The dial pointer leaned one way or the other as one tuned up and down. And the radio was so bad on both bands that it would be a total turn-off to anyone wanting to give radio a try for the first time. (And there are now large demographics in that category.) Noise rejection on AM was completely absent, and the radio only got one station. There wasn’t even a tiny hint that there might be other signals out there. (And there are many here.) On FM, strong local signals splattered all over the dial, and the tuner made strange pulsing sounds as I (with lots of torque) turned the dial pointer slowly (there was no other way) up and down. And the sound kinda sucked too.

The only thing that worked as advertised was the Bluetooth. My phone found the thing instantly.

As for the sound, I’m much better served by the Bluetooth speakers I already have. There are good ones in the same price range.

The glowing rings around the speakers are just eye candy.

I suppose the rechargeable battery is okay. It worked for the few minutes I bothered checking it out.

The lesson: If you want a good boom box, check out your neighborhood or online thrift stores. Here’s Facebook Marketplace for my region here.

I suspect at least one of these is still good.

A final word to Sam’s Club: please stop selling this piece of crap. It makes victims of your customers.


Jon Udell

Web Components

The JavaScript industrial complex won’t crumble anytime soon. But the stage is set for a return to an ecosystem of reusable components accessible to business developers, only this time based on the universal web platform and its core standards.

How To Build Web Components Using ChatGPT

LLM series at The New Stack

Tuesday, 11. February 2025

Ben Werdmüller

The Social Web Foundation announces its membership in the World Wide Web Consortium

[Social Web Foundation]

The Social Web Foundation has joined the W3C:

"SWF joins CDT as one of the few civil society organizations that comprise the Consortium. SWF’s membership in the W3C underscores our commitment to promoting an open and federated social web, and our alignment with the W3C mission to develop web standards through community consensus that ensures the long-term growth and accessibility of technical specifications that are openly licensed."

More forward motion for the open social web as a core part of the open web itself. This is also very good news:

"In terms of concrete ongoing work, we look forward to bringing end-to-end encryption to direct messages in ActivityPub, developing groups on the social web, supporting data portability with ActivityPub, making discovery of ActivityPub objects and their authors easier."

These will all make the open social web safer, more flexible, and easier to build on for new platform entrants. Let's go.

#Fediverse

[Link]


Simon Willison

llm-sort

llm-sort

Delightful LLM plugin by Evangelos Lamprou which adds the ability to perform "semantic search" - allowing you to sort the contents of a file based on using a prompt against an LLM to determine sort order.

Best illustrated by these examples from the README:

llm sort --query "Which names is more suitable for a pet monkey?" names.txt

cat titles.txt | llm sort --query "Which book should I read to cook better?"

It works using this pairwise prompt, which is executed multiple times using Python's sorted(documents, key=functools.cmp_to_key(compare_callback)) mechanism:

Given the query:
{query}

Compare the following two lines:

Line A:
{docA}

Line B:
{docB}

Which line is more relevant to the query? Please answer with "Line A" or "Line B".
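
For illustration, here's a minimal sketch of that cmp_to_key sorting pattern, with a trivial stand-in comparator in place of the plugin's LLM call:

import functools

def compare_callback(doc_a, doc_b):
    # the real plugin asks the LLM the prompt above; return a negative number
    # if doc_a should sort first, a positive number if doc_b should
    return -1 if len(doc_a) < len(doc_b) else 1

documents = ["a fairly long line of text", "short", "a medium-length line"]
print(sorted(documents, key=functools.cmp_to_key(compare_callback)))
# ['short', 'a medium-length line', 'a fairly long line of text']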

From the lobste.rs comments, Cole Kurashige:

I'm not saying I'm prescient, but in The Before Times I did something similar with Mechanical Turk

This made me realize that so many of the patterns we were using against Mechanical Turk a decade+ ago can provide hints about potential ways to apply LLMs.

Via lobste.rs

Tags: llm, plugins, generative-ai, ai, llms, python, mechanical-turk


Ben Werdmüller

Tumblr to join the fediverse after WordPress migration completes

[Sarah Perez at TechCrunch]

This was a very nice surprise to see:

"Automattic confirmed to TechCrunch that when the migration is complete, every Tumblr user will be able to federate their blog via ActivityPub, just as every WordPress.com user can today."

ActivityPub is the open protocol behind Mastodon and Pixelfed, among others. Threads and Ghost have also been steadily adding support.

Given the long tail of ActivityPub and the simultaneous rise of Bluesky, which is connected to the ActivityPub network via Bridgy Fed, the future of the open social web is very bright. It is the future of all social media. This is another great milestone along the road.

#Fediverse

[Link]


Phil Windley's Technometria

The Business Case for Dynamic Authorization

Authorization isn't just about security.

Access management is seen by many business leaders as primarily a means of protecting and securing computer systems. Important, but of secondary importance to the business. But as computer systems come to intermediate almost every interaction the business has with employees, contractors, partners, and customers, dynamic authorization should be seen as a strategic business enabler. Investing in a robust, flexible, and pervasive authorization infrastructure can drive revenue growth, operational efficiency, regulatory compliance, and competitive differentiation.

Reducing Operational Costs

Manually managing access using rigid, static authorization models like ACLs and groups is labor-intensive and prone to errors. Organizations that rely on static methods often have employees who are dedicated to managing permissions for employees and others. These employees also perform manual audits, track down problems, and manage groups. As the organization grows, these processes become more complex—scaling superlinearly due to interactions in the system.

Dynamic authorization automates many of these access control decisions, reducing the need for manual intervention. This has several benefits:

Lower administrative overhead—eliminating the need for manually managing permissions and groups reduces administrative costs.

Reduced risk of over-permissioning—accounts with permissions they no longer need are the source of many costly security breaches.

Reduced security insurance premiums—many organizations buy costly insurance for security breaches and ransomware. Better authorization practices and systems can reduce premiums.

Fewer support tickets for access problems—tickets that require IT to drop what they’re doing to sort out a permissioning problem take these critical employees away from work that advances the organization’s products and services.

Improved onboarding and offboarding efficiency—dynamic authorization can ensure new employees or contractors have all the access they need on day one and lose it as soon as they leave.

Improved operational efficiency gives the organization the freedom to explore and grow instead of constantly battling access management problems.

Enabling Business Agility and Innovation

As more and more business is conducted online, organizations are finding that it’s vital to quickly react to changing business needs. Whether an organization is launching a new product, expanding into new markets, reacting to new regulatory requirements, or enabling new partnerships, being able to flexibly adapt to emerging requirements and support innovation is table stakes for successful organizations.

As we’ve discussed, static authorization methods require manual changes to lists and groups to increase or decrease access to systems. For example, a financial services firm that is employing external auditors for compliance reviews must grant access for the duration of the engagement. A dynamic authorization system makes this as easy as a policy change. Even that might not be required if authorization policies have been written so as to anticipate this kind of need.

New products often require custom code to support authorization requirements for customers and administrators. A workforce management service provider launching a new employee onboarding product must ensure that customers can properly authorize varying access levels for their employees to administer and manage the service securely. A dynamic authorization system can be integrated with the new product, allowing developers to build in the right authorization controls without writing custom authorization code.

Improving Customer Experience

The compelling features of modern SaaS applications, marketplaces, and collaborative services depend on carefully controlling access. In these platforms, dynamic authorization isn’t just for security; it also enhances the user experience and provides enhanced revenue opportunities.

For example, platforms like Google Docs, Dropbox, and SmugMug all allow their customers to share content with specific people or groups. Dynamic authorization makes this functionality possible.

Or consider multi-tenant SaaS companies like Workday or Salesforce. Fine-grained authorization allows these companies to isolate customer data while simultaneously allowing granular access within each tenant that follows complex rules. These companies can’t build a single access management system because each tenant requires different access controls depending on their organization, regulatory environment, and internal access policies. Dynamic authorization lets them more easily meet customer needs.

Finally, many online businesses offer different features to different users depending on subscription levels. Dynamic authorization policies allow this to be done without custom code and give the business flexibility to add or modify subscription levels and features without changing the underlying code.
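As a loose illustration of this point, here is a minimal, hypothetical sketch in which subscription-tier entitlements live in policy data rather than in application code, so adding or changing a tier is a data change instead of a code change. The tier and feature names are invented for the example, and a real dynamic authorization system would typically externalize this data into a policy engine or service.

// Hypothetical sketch: feature entitlements expressed as policy data instead of code.
// Tier and feature names are illustrative only.
using System;
using System.Collections.Generic;

public static class SubscriptionPolicy
{
    // Adding or changing a tier is a data/policy change, not a code change.
    private static readonly Dictionary<string, HashSet<string>> Entitlements = new()
    {
        ["free"] = new() { "basic-reports" },
        ["pro"] = new() { "basic-reports", "export", "api-access" },
        ["enterprise"] = new() { "basic-reports", "export", "api-access", "sso", "audit-logs" },
    };

    public static bool IsAllowed(string tier, string feature) =>
        Entitlements.TryGetValue(tier, out var features) && features.Contains(feature);
}

public static class Demo
{
    public static void Main()
    {
        Console.WriteLine(SubscriptionPolicy.IsAllowed("pro", "export"));      // True
        Console.WriteLine(SubscriptionPolicy.IsAllowed("free", "audit-logs")); // False
    }
}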

Strengthening Security and Compliance

Using dynamic authorization for improved security and regulatory compliance provides several business advantages. Industries like finance, healthcare, and government are heavily regulated. The regulations require organizations to enforce least privilege access, ensure auditability, and dynamically adjust access based on employee roles and changing risk conditions. Organizational benefits from using dynamic authorization include decreased compliance risk, better employee experience, fewer workarounds that introduce security problems, and reduced overall cost.

Competitive Differentiation

Using dynamic authorization inside products gives organizations a competitive edge by offering a more secure, flexible, and user-friendly product.

For example, a B2B SaaS company whose product is built with fine-grained access control can better attract large enterprise customers who demand flexible yet secure features. A financial services company that lets customers dynamically set transaction limits based on varying risk signals can reduce fraud while maintaining a rich user experience. A collaboration tool that offers flexible, secure content sharing beats out competitors who use more rigid, static sharing models.

Organizations can more easily respond to competitor product changes when access management is as simple as a policy change. And dynamic authorization provides these benefits without developers having to write custom code.

A Business Necessity

The preceding sections offer multiple examples of how dynamic authorization goes well beyond enhanced IT security. Organizations that embrace dynamic authorization gain enhanced operational efficiency through automation, increased business agility to more easily pursue new opportunities, stronger security and compliance with less overhead, and better customer experiences that drive customer engagement and revenue.

In the era of multi-tenant, AI-enhanced SaaS applications, dynamic authorization is essential for organizations to securely scale and effectively compete. Failing to adopt better access management technologies and mechanisms puts organizations at risk of losing their competitive advantage.

Photo Credit: Octopus busy approving things from DALL-E (public domain) Prompt: Draw a picture of an octopus wearing an officials hat with each arm holding stamps that say either "allow" or "deny". The octopus is stamping multiple sheets of paper on a desk.


Heres Tom with the Weather

Bitcoin Science

Bitcoin failed. J.P. Koning notes in The end of El Salvador’s bitcoin payments experiment: But here was a government that was going to champion the stuff, nullifying all of the headwinds against bitcoin in one stroke! The government meddling hypothesis would be put to test. The Salvadoran government used a combination of sticks and carrots to kick-start adoption. First, let's list the ca

Bitcoin failed. J.P. Koning notes in The end of El Salvador’s bitcoin payments experiment:

But here was a government that was going to champion the stuff, nullifying all of the headwinds against bitcoin in one stroke! The government meddling hypothesis would be put to test. The Salvadoran government used a combination of sticks and carrots to kick-start adoption. First, let's list the carrots. The capital gains tax on bitcoin was set to zero to remove the hassle of buying stuff with bitcoin. The government also built a bitcoin payments app, Chivo, for all El Salvadoreans to use. (Chivo also supports U.S. dollar payments.) Anyone who downloaded Chivo and transacted with bitcoin would receive a $30 bitcoin bonus—that's a lot of money in El Salvador. Gas stations offered $0.20 off of a gallon of gas for customers who paid with the app. People could also use Chivo to pay their taxes with bitcoin. The biggest carrot was zero-transaction fees. Any payment conducted with Chivo was free, as was converting bitcoins held in the Chivo app into U.S. dollars and withdrawing cash at Chivo ATMs. These Chivo ATMs were rolled out across El Salvador and in the U.S., too, to encourage the nascent U.S.-to-El Salvador bitcoin remittance route. Bitcoin ATMs are usually incredibly pricey to use, but in El Salvador the government would eat all the transaction fees. What a fantastic deal. As for the stick, Bukele introduced a forced-tender rule. Beginning in 2021, businesses were required to accept the orange coin or be punished. This was costly for them to comply with. They would have to update point of sale software, signage, train employees, and set up new processes for handling bitcoins post-sale. By all rights, this combination of sticks and carrots should have led to a flourishing of bitcoin payments. But it didn't.

Koning concludes:

The saddest thing about El Salvador's bitcoin experiment is that all sorts of time and resources have been wasted. El Salvador is not a rich country. The money spent on building and operating Chivo, compliance by businesses, bitcoin signage, and subsidies could have been better deployed on more important things like health and education. One hopes that other countries learn from this experience and avoid going down the same route that El Salvador did.

The Pragmatic Engineer

Is Big Tech becoming more cutthroat?

Is it the end of a golden age of Big Tech, when jobs at leading companies offered high-impact work, top compensation, and good work-life balance? More signs suggest harsh perf management is the norm

A few years ago, companies like Google, Microsoft and Facebook were known as places where – inside certain organizations – some engineers could earn large sums of money for doing comparatively little work, and wile away the work week waiting for their large new joiner equity to vest over four years. This chill approach is what the “rest” in rest-and-vest refers to, which was a popular phrase at the time for it. These places also had many teams where work was laid back, and it was possible to “coast” and get by doing relatively little.

In 2017, Business Insider interviewed several such folks and wrote:

“Business Insider talked to about a half a dozen people with direct knowledge of the rest-and-vest culture. Some were "fat cats" themselves. Some were hiring managers who tried to lure these folks back to the world of productivity. Many acknowledged that resting and vesting was a common, hush-hush practice at their own companies. Internally, these people are often referred to as "coasters."

Their lives counter the other reality for many in the tech world: long work hours and pressure for workers to pledge unrelenting devotion to their companies and jobs above all else.”

A culture of lenient performance management at the biggest tech businesses contributed to laidback work patterns; I talked with managers at the likes of Google and Microsoft at the time who were frustrated that the system made it hard to manage out folks who were visibly checked out, and were hard to motivate to do even the bare minimum work.

Fast forward to today, and there are signs that Big Tech employers are being tougher than ever in performance management, and any tolerance of “rest and vest” culture is history. This article covers:

Meta: first performance-based mass layoffs. Nearly 3,700 people have been dismissed as ‘low performers’; it’s the first time that another reason wasn’t given for layoffs by the social media giant.

Microsoft: performance-based firings are back. Employees terminated on the spot for performance reasons, without warning, and some don’t get severance.

Evolution of Microsoft’s performance management. The Windows maker has been brutal in performance management before: its old stack ranking system was unpopular for close to two decades.

Even without stack ranking, there’s still bucketing. Stack ranking fell out of favor years ago, but bucketing is still how performance reviews work.

Why now? The end of zero rates, a cooling job market, and more. Takeaways from recent mass layoffs.

Companies that always had high performance expectations. Some Big Tech businesses have always had strict performance standards. For example, within Amazon, Netflix, and NVIDIA, little will change.

Related to this article is our two-part deepdive into How performance calibrations are done at tech companies.

1. Meta: first performance-based mass layoffs

Meta did no mass layoffs for its first 18 years of its existence, until November 2022 when it let go 13% of staff. Back then, there were business reasons. I wrote at the time:

Meta’s revenue is simply not growing fast enough. In Meta’s Historic Growth Challenge, I suggested that just by allowing attrition to slightly reduce headcount expenses, Meta could catch up with profitability. However, what I did not account for was how attrition was likely too slow to wait for, and not fully within the company’s control. Layoffs, however, are.

The stock price pressure likely became too much. Meta’s stock price dipped under $100 for the first time on Thursday, 27 October, a week before layoffs. The stock dropped by 26% that day, which was the biggest one-day fall, after the company forecast weaker-than-expected revenue growth for Q4 2022. (...)

Zuck has summarized these layoffs:

“We’ve cut costs across our business, including scaling back budgets, reducing perks, and shrinking our real estate footprint. We’re restructuring teams to increase our efficiency. But these measures alone won’t bring our expenses in line with our revenue growth, so I’ve also made the hard decision to let people go.”

Six months later in early 2023, the company reduced headcount by another 11%, letting go of 10,000 people. The reasoning was that it had overhired during the pandemic years of 2020-2021, and was too bloated. The layoffs flattened the organization and boosted efficiency.

That was two years ago, and since then Meta has become more efficient: it generates more revenue ($164B per year) and profit ($62B) than ever before, and its value is at an all-time high of $1.8 trillion.

It’s in this context that Meta announces its first-ever performance-based mass layoffs. Five percent of staff are expected to be let go, starting this week with around 3,700 people. An internal email from Mark Zuckerberg explains why, as reported by CNBC:

“I’ve decided to raise the bar on performance management and move out low performers faster. We typically manage out people who aren’t meeting expectations over the course of a year, but now we’re going to do more extensive performance-based cuts during this cycle, with the intention of back filling these roles in 2025.

We won’t manage out everyone who didn’t meet expectations for the last period if we’re optimistic about their future performance, and for those we do let go, we’ll provide generous severance in line with what we provided with previous cuts.”

This clarity that it’s “low performers” who are being laid off is new. The large mass layoffs of 2022-23 were justified differently. Of course, low performers are at risk of being let go in most circumstances. However, in Meta’s previous layoffs, plenty of high performers were also cut because they worked in teams seen as bloated cost centers or targeted for sharp headcount drops.

While these cuts are unfortunate, Meta at least is offering generous severance to those impacted: 16 weeks of pay and an additional two weeks for each year of service.

2. Microsoft: performance-based firings are back

Meta isn’t the only tech giant terminating employees based on performance concerns; Microsoft is doing the same — except on an individual basis. Also from Business Insider:

“Microsoft has started performance-based job cuts, according to termination letters viewed by Business Insider.

Microsoft is taking a stronger stance on performance management like its competitors, and managers at the company have spent the past few months evaluating employees all the way up to level 80, one of its highest levels.”

One of several termination letters was reported by Business Insider. It reads:

"The reason(s) for the termination of your employment include that your job performance has not met minimum performance standards and expectations for your position… You are relieved of all job duties effective immediately and your access to Microsoft systems, accounts, and buildings will be removed effective today. You are not to perform any further work on behalf of Microsoft."

Just to repeat, performance-related firing is commonplace, but what’s different here is how short and quick the process is. Previously, most of Big Tech followed a standard process for workers seen as in need of improvement:

Feedback from the manager, often at a biannual performance review

Performance improvement plan (PIP) which formalises why someone is not meeting performance expectations, and how to succeed with the plan

Dismissal upon failure to clear the PIP. Big Tech has been known for generous severance packages which exceed the legal minimum

But now, Microsoft seems to be skipping PIPs and also not offering severance in some cases. This is unusual, given how differently the tech giant had treated employees since Satya Nadella became CEO. It also feels unusually petty to cancel severance packages for those affected, especially as Microsoft is reporting record profits. Is it a message to low performers to expect nothing from the company?

Microsoft getting “cutthroat” in its performance-management is also out of character, as it was Nadella who introduced a more lenient performance management approach, back in 2014.

3. Evolution of Microsoft’s performance management

Between the 1990s and 2013, Microsoft used a stack ranking system for performance management, which wasn’t advertised to employees until the mid-2000s – although many knew about Microsoft’s “vitality curve” for ranking engineers and managers. Under this, workers high on the curve got outsized bonuses and pay rises, and those low down the curve, well, they got managed out.

In 2004, Mini Microsoft (an anonymous employee at the company, blogging in public) wrote a post explaining how the then still-secretive stack ranking worked:

“Then along came a new lead. Her feedback [to me was], "You've got to increase your team visibility so that you can do better in the stack rank meeting."

The what-rank? She said it slower as if it would help me to divine what the heck she was talking about. Then she got up and gave me the stack rank lesson and I got to learn about how the team is divided into columns of high, medium, and low folks and then each column has a person by person relative ranking, all those positions negotiated by the leads putting their people up on the whiteboard and then arguing the merits of which report belongs above which other reports.

She said they set the context of their decisions by asking a question like, "Okay, if the team were on a sinking boat and we had to decide who we would put on the life-boats, who would it be?" Up to that point, my ass was next in line for the boat but still going down with the ship.”

From 2004 – mostly thanks to this blog post – stack ranking was no longer a secret, but it wasn’t until 2011 that then-CEO Steve Ballmer acknowledged its existence in an internal email, writing:

“Each rating at each level will now have set compensation tied to the rating.

These ratings will be based on the results you accomplished during the review period (assessed against your commitments), how you accomplished them, and your proven capability. Ratings will be a simple 1-5 system with relative performance being assessed across common peer groups.”

The buckets were pre-defined, supposedly as 20% (top performers), 20% (good performers), 40% (average), 13% (below average), and 7% (poor performers).

I worked at Microsoft starting in 2012, the year after the existence of the stack ranking system became public knowledge. Knowing the distribution made me hope for a grade of 1-2, which would have meant my manager saw me as the “top 40%” within the team. I ended up getting a “3” in 2013, which I was disappointed with, as I interpreted it as being in the bottom 20th to 60th percentile.

Later, I talked with an HR person, who told me that nobody at Microsoft was ever happy with their grades:

Those getting a “1” (the highest rating for the top 20%) think they deserve it. Many feel entitled to it, more than they’re happy to achieve it

Everyone on a 2 to a 5 is disappointed to different extents

Steve Ballmer’s departure spelt the end of the stack ranking system. Shortly after Ballmer announced his retirement in August 2013, the company announced the system was also being retired, effective immediately in November 2013. There are a few possible reasons why stack ranking went extinct:

1. Office politics ruled Microsoft. From the mid-2000s, it was increasingly clear that internal politics was more important than building products customers loved.

Microsoft was blindsided by the 2007 launch of the iPhone, and the launch of Android the next year. It took three more years to finally launch a competitive device – the Windows Phone in 2011. By then, iPhone and Android had captured the bulk of the smartphone market.

In 2011, Google software engineer and cartoonist Manu Cornet drew a cartoon about how he perceived Amazon, Google, Facebook, Microsoft, Apple, and Oracle. This is how he represented the Xbox maker:

Source: Manu Cornet

This image went viral, even though Manu never intended it as the big joke of his comic strip, as he explains in The Man Behind the Tech Comics. The intended target of the joke was Oracle, but his image of Microsoft captured a widely perceived truth.

Back then, there was close to zero collaboration between divisions at Microsoft, each thousands of employees in size, like Windows, Office, Server, Xbox, and Skype. I experienced the lack of collaboration – to the point of hostility – first-hand. In late 2013, my team was building Skype for Web, which we positioned as a competitor to Google Hangouts. We had a problem, though: in order to start a video or voice call, users needed to download a plugin which contained the required video codecs. We noticed Google Hangouts did the same on Internet Explorer and Firefox, but not on Chrome, because the plugin was bundled with that browser for a frictionless experience.

My team decided we had to offer the same frictionless experience on Microsoft’s latest browser, Edge, which was in development at the time. After weeks of back-and-forth, the Edge team politely and firmly rejected bundling our plugin into the new Microsoft browser. The reason? Their KPI was to minimize the download size of the browser, and helping us would not help them reach that goal.

It was a maddening experience. Microsoft could not compete with the likes of Google due to internal dysfunction like this; with teams and individuals focused on their own targets at the expense of the greater good for the company and users.

2. Stack ranking pinpointed as the core of the problem. In 2012, Vanity Fair published Microsoft’s lost decade, which said:

“At the center of the cultural problems was a management system called “stack ranking.” Every current and former Microsoft employee I interviewed—every one—cited stack ranking as the most destructive process inside of Microsoft, something that drove out untold numbers of employees. (...)

The system—also referred to as “the performance model,” “the bell curve,” or just “the employee review”—has, with certain variations over the years, worked like this: every unit was forced to declare a certain percentage of employees as top performers, then good performers, then average, then below average, then poor.

In the end, the stack-ranking system crippled the ability to innovate at Microsoft, executives said. “I wanted to build a team of people who would work together and whose only focus would be on making great software,” said Bill Hill, the former manager. “But you can’t do that at Microsoft.”

3. Investor and board pressure. By 2013, Microsoft’s stock had been flat for about 12 years. It was clear that cultural change was needed to turn business performance around, and removing the hated stack ranking system was one of the easiest ways for the leadership team to show that change was afoot.

4. Ballmer’s exit. Several leaders including Head of HR Lisa Brummel were never in favor of stack ranking, as Business Insider reported at the time. With Ballmer gone, executives could push decisions that would’ve previously been vetoed, before a new CEO took the helm.

Satya Nadella replaced stack ranking with a more collaborative performance review system. As CEO, he recognized the cultural problems Microsoft had. In his 2017 book, Hit Refresh, he recalled the pre-2014 times:

"Innovation was being replaced by bureaucracy. Teamwork was being replaced by internal politics. We were falling behind."

A new performance review system attempted to address the problems, rating employees in three areas:

Individual impact

Contributing to others’ success

Leveraging the work of others

Microsoft also got rid of its vitality curve (the stack ranking system), starting from 2014. The changes resulted in a different performance review process, where individual impact carried less weight. In 2022, Microsoft even started to measure how many of its employees said they were “thriving”, which it defined as being “energized and empowered to do meaningful work.” Note that this was at the peak of the hottest job market in tech, when attrition spiked across the sector, and even Big Tech needed new ways to retain people.

Signs that performance management was changing again were already visible: last September, Microsoft quietly introduced a new field for managers called “impact designators.” They had to rate the impact of their reports and not disclose this to employees. The ratings determined bonuses and pay rises.

As a former engineering manager, what surprised me about this lowkey change was not that it happened, but the question it raises: what was Microsoft doing before? “Impact designator” is another name for “multiplier”, a term used in most tech workplaces. Ahead of performance calibration meetings, managers often know this information and must fit within the budget, or can sometimes exceed it. Multipliers are finalized in the calibration, which helps with dividing bonus pots, equity refreshes, and pay rise budgets.

So it was a surprise to learn that Microsoft operated for nine years without managers setting or recommending multipliers as part of the performance process.

4. Even without stack ranking, there’s still bucketing

The demise of divisive stack ranking was cheered, but in reality all larger companies still operate ranking frameworks today. At most mid-sized-and-above companies, performance review processes have the explicit goal of identifying and rewarding top performers, and of finding low performers and figuring out what to do next. We cover the dynamics of the process in our two-part deepdive on performance calibrations at tech companies, including:

Read more


Ben Werdmüller

Releasing all my text under a Creative Commons license

From time to time, people ask me if they can use the content of my posts in another venue or another format. To make that possible, today I released all the text of all my posts under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license. You’ll see that declaration on the footer of every page of my site. What does that mean? I’m releasing the text of each post, no

From time to time, people ask me if they can use the content of my posts in another venue or another format. To make that possible, today I released all the text of all my posts under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license. You’ll see that declaration on the footer of every page of my site.

What does that mean?

I’m releasing the text of each post, not the images. That’s because I license most of the images myself and don’t have the legal right to re-license them.

You can copy and redistribute the material in any medium or format. Want to take the text and put it in a course, or another website, or an app, or a comic book? Knock yourself out.

You can adapt the material. If you want to translate it, illustrate it, build on it — go for it.

You can’t use it for commercial purposes under this license. I’m very open to my content being used for commercial purposes, but we need to work out the terms together.

You need to attribute the original work. That means identifying me as the original author, linking to the original post or page, and clearly indicating if changes were made.

You’ve got to release your version under the same license. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

You can’t add restrictions that stop people from doing any of the above things that are permitted by the license.

Take a look at the full license text for the complete picture. In particular, note the following:

No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.

Lastly: it’s optional, but if you do use the content, I’d love to hear about it. You can always email me at ben@werd.io.


A Few Eggs of Advice in These Trying Times

[Oneofthelibrarians at LibrarianShipwreck] Some advice about how to survive this era from LibrarianShipwreck, one of my most favorite blogs on the planet: "So, uh, it’s pretty bad out there! You are probably trying to figure out what the hell to do about it. Here are some words of advice, from wisdom gained through a couple decades in the trenches. Hope it helps." Some of

[Oneofthelibrarians at LibrarianShipwreck]

Some advice about how to survive this era from LibrarianShipwreck, one of my most favorite blogs on the planet:

"So, uh, it’s pretty bad out there! You are probably trying to figure out what the hell to do about it. Here are some words of advice, from wisdom gained through a couple decades in the trenches. Hope it helps."

Some of these are things that I am very bad at, including prioritizing physical health / ability. I think there's also a lot to say in favor of this:

"The Western, and especially USAian, mythos of the singularly special hero is a load of hooey. Don’t fall into that trap. Even when we occasionally do have individuals who make an outsized difference, if you need the thoughts in this post you are almost certainly not positioned to be that person. And that’s ok!"

What these times need, in other words, is co-operation, mutual aid, community, and allyship. American culture, as the piece says, is oriented around rugged individualism; while we all have individual rights, including the right to self-identity, the right to safety, the right to freedom of speech and thought, and so on, it's community that will set us free.

Jerry Springer was a cultural grenade who in some ways paved the path to where we are, but he got one thing right: he signed off every day with the mantra, "take care of yourselves, and each other." That's the spirit.

#Democracy

[Link]

Monday, 10. February 2025

Ben Werdmüller

The NSA's "Big Delete"

[Judd Legum and Rebecca Crosby in Popular Information] The removal of banned terms on both internal and external government websites is going more stupidly than one might have expected: "One example included a job listing page for the Department of Homeland Security that removed language about maintaining an “inclusive environment.” The Post also found examples of words bein

[Judd Legum and Rebecca Crosby in Popular Information]

The removal of banned terms on both internal and external government websites is going more stupidly than one might have expected:

"One example included a job listing page for the Department of Homeland Security that removed language about maintaining an “inclusive environment.” The Post also found examples of words being removed that had nothing to do with DEI, such as a page on the Department of the Interior’s website that boasted of its museums' “diverse collections,” removing the word “diverse.”"

And:

"The memo acknowledges that the list includes many terms that are used by the NSA in contexts that have nothing to do with DEI. For example, the term "privilege" is used by the NSA in the context of "privilege escalation." In the intelligence world, privilege escalation refers to "techniques that adversaries use to gain higher-level permissions on a system or network.""

The whole enterprise is silly, of course, but this is an incredibly bad way to go about it. Words have meaning, and sometimes you need to use them. A global search and replace isn't a perfect way to revamp the whole apparatus of federal government.

#Democracy

[Link]


Doc Searls Weblog

On Regrets and Their Absence

Somebody just gave the 31st upvote to my answer to the Quora question “What do you regret as you get older?” So I thought it might be worth repeating. Here’s a short list: Not learning at least one other language. Not learning a musical instrument (or to sing) well enough to play in a group. […]

Somebody just gave the 31st upvote to my answer to the Quora question “What do you regret as you get older?” So I thought it might be worth repeating.

Here’s a short list:

Not learning at least one other language.

Not learning a musical instrument (or to sing) well enough to play in a group.

Not getting fit physically and staying that way. (Related: not working hard to succeed in at least one varsity sport—especially one requiring teamwork.)

Not spending more time in other countries and cultures when I was young.

Getting married too early and often. (Since you’re asking, twice before a great marriage now going on 27 years.)

For a full perspective on those, here’s a list of things I don’t regret:

Staying curious about everything, all my life. (“Learn as if you’ll live forever, live as if you’ll die tomorrow,” Gandhi is said to have said.)

Learning to ski (in my 40s) and learning to play soccer and basketball well enough not to be chosen last in pickup games.

Sucking in school, which forced me to learn in other ways—and to learn well enough to at least play a decent academic later in life.

Never taking up smoking, drugs or drinking heavily.

Watching so little TV that I miss most cultural references to various stars and shows.

There are many more possibilities for both lists, but that’s a start.

That was seven years ago. My wife and I have now been married for almost 34 years.


Ben Werdmüller

If MSN comments reveal the soul of America, we're done

For a while now, I’ve been syndicating my posts to MSN. You can see Werd I/O’s profile over there. In some ways, this is my normiest network: whereas my Mastodon community is more technical, my Bluesky community is more political and my newsletter subscribers tend to be a mix of people from the tech and media worlds alongside people I otherwise know, MSN encompasses Windows users who the algorit

For a while now, I’ve been syndicating my posts to MSN. You can see Werd I/O’s profile over there. In some ways, this is my normiest network: whereas my Mastodon community is more technical, my Bluesky community is more political and my newsletter subscribers tend to be a mix of people from the tech and media worlds alongside people I otherwise know, MSN encompasses Windows users who the algorithm thinks should be sent my stuff.

The comments have long fascinated me: they’re incredibly right-wing. I’d initially dismissed them as being part of some influence campaign on the network, but I now see them as an important barometer of a cross-section of what the American public thinks. It’s not good news.

For example, here’s a selection of comments on the MSN version of my link blog post for The 19th’s article about USAID’s lifesaving reproductive healthcare. There’s a lot of this kind of thing:

“Women need to be responsible for their own behaviors. If they become pregnant then they need to seek and pay for their care to ensure the baby is born healthy. Just another waste of taxpayer money.”

And:

“It takes two to tango, where are all these dead beat dads? Why is the American taxpayer responsible for the entire planet? Have any of you women ever heard the word no? Not in your language? Then cross your legs. MSN doesn't like the truth. Communist sensors.”

And, bafflingly:

“How do contraceptives prevent STDs and HIV? They don’t.”

And the absolutely nihilistic but also inherently counterproductive:

“worlds overpopulated as it is.”

As well as the top-rated comment at the time of writing:

“USAID has only used a small portion of the funds for humanitarian purposes. The vast majority has been used for crazy liberal agendas that have nothing to do with humanitarian purposes. Corrupt Democrats have been caught red handed that's why they are trying to cover up what the taxpayers' funds have really been used for.”

My fear is that this is America. These comments are ill-informed, occasionally wildly racist, and light years away from the debate I’d expect to have in other forums. It’s easy to dismiss most of these people as being idiots (something I can’t easily avoid). There are almost no tolerant or left-wing voices in the mix; instead, we’re left with the kind of rhetoric you might otherwise expect to see in communities that have dismissed Fox News as being too soft.

If I’m right, which I’d prefer not to be, it doesn’t say great things about our prospects over the next four years, or for the future of the country. If this is where normie discourse is at, it’s going to be rough.

Anyway, I’ll leave you with two more comments, from other posts:

“Thank you President Trump for putting America and Americans first. When the far left crooks scream loud we know we are on target. FEAR !”

And:

“what maga when both parties just care more about a foreign country while democrats just engage in h ate speech toward the majority and republicans dont care and wont call them ra cists as they are being called that for everything.”

Oof.


Judith Butler, philosopher: ‘If you sacrifice a minority like trans people, you are operating within a fascist logic’

[Iker Seisdedos interviewing Judith Butler in EL PAÍS English] Judith Butler is as on-point as ever: "Q. It wasn’t just Trumpism. Some Democratic voices say it’s time to move beyond the issue of trans rights in areas like sports, which affect very few people. A. You could say that about the Jews, Black people or Haitians, or any very vulnerable minority. Once you decide tha

[Iker Seisdedos interviewing Judith Butler in EL PAÍS English]

Judith Butler is as on-point as ever:

"Q. It wasn’t just Trumpism. Some Democratic voices say it’s time to move beyond the issue of trans rights in areas like sports, which affect very few people.

A. You could say that about the Jews, Black people or Haitians, or any very vulnerable minority. Once you decide that a single vulnerable minority can be sacrificed, you’re operating within a fascist logic, because that means there might be a second one you’re willing to sacrifice, and a third, a fourth, and then what happens?"

This is exactly it. I've also heard voices say that there should have been less discussion of racial equity: less Black Lives Matter, less 1619 Project, less discussion of systemic inequality. It's nonsense, and as Butler says, it's a road that leads us down an inevitably fascist path.

The whole interview is very much worth your time: nuanced and well-considered.

#Democracy

[Link]


@_Nat Zone

February 26: I will appear at PwC Digital Identity Forum 2025 – “The Future of Open Finance and the Importance of Digital Identity”

I will appear at PwC Digital Identity Forum 2025 on February 26. Date and time: Wednesday, February 26, 2025, 15:00–19:00 (reception opens at 14:45). Registration deadline: Tuesday, February 25, 2025, 12:00. Venue: Tokyo International Forum Hall B5 (Tokyo…

I will appear at PwC Digital Identity Forum 2025 on February 26.

Date and time: Wednesday, February 26, 2025, 15:00–19:00 (reception opens at 14:45)
Registration deadline: Tuesday, February 25, 2025, 12:00
Venue: Tokyo International Forum Hall B5 (5F, Block B, 3-5-1 Marunouchi, Chiyoda-ku, Tokyo)

My slot is Session 1 (15:15 – 15:55).

Session 1: The Future of Open Finance and the Importance of Digital Identity

With so many companies now involved in financial services, this session explores the transformative power of fintech in Japan and the critical role digital identity plays in making financial transactions more trustworthy. Drawing on global examples, we will redefine how a fintech world in which security and openness are intricately intertwined can be realized.

Speakers

The other speakers are so distinguished that one might wonder whether I am needed at all, but please do come along.

Incidentally, the session after mine features Hayashi-san of the Digital Agency (@lef), Moriyama-san of NTT Docomo, and Shibata-san of PwC.

16:10–16:50 Session 2: Identity Verification’s Growing Importance in Business Strategy – A New Standard for the Digital Society Supporting Enterprises

As the digital society evolves, identity verification plays an ever larger role for every company, from preventing fraud such as impersonation to verifying age to protect minors. In this session, experts at the forefront of the industry give a multi-angle explanation of the latest identity verification mechanisms, such as the increasingly widespread My Number Card and passkeys, and their applications. We will explore the value of identity verification in business activities and practical ways to apply it, with concrete examples and the latest trends.

Speakers

Registration and the rest of the program are available on the official site.

I look forward to seeing you there.


Damien Bod

ASP.NET Core delegated OAuth Token Exchange access token management

This blog shows how to implement a delegated OAuth 2.0 Token Exchange RFC 8693 flow in ASP.NET Core, and has a focus on access token management. It looks at how the OAuth Token Exchange can be implemented and how an application can request delegated access tokens on behalf of a user and another application, providing […]

This blog shows how to implement a delegated OAuth 2.0 Token Exchange RFC 8693 flow in ASP.NET Core, with a focus on access token management. It looks at how the OAuth Token Exchange can be implemented and how an application can request delegated access tokens on behalf of a user and another application, providing seamless and secure access to protected resources using a zero trust strategy.

Code: https://github.com/damienbod/token-mgmt-ui-delegated-token-exchange

Blogs in this series

ASP.NET Core user delegated access token management

ASP.NET Core user application access token management

ASP.NET Core delegated OAuth 2.0 Token Exchange access token management

Setup

The solution implements an ASP.NET Core web application which authenticates using Microsoft Entra ID. The web application uses an API protected with a Microsoft Entra ID access token. This API uses another downstream API protected with Duende IdentityServer. The API exchanges the Microsoft Entra ID access token for a new Duende IdentityServer access token using the OAuth 2.0 Token Exchange standard. Both APIs use a user delegated access token. The tokens are persisted on the trusted backend using the IDistributedCache implementation. This can be an in-memory cache or a persistent cache. When using this cache, it is important to automatically renew the access token, if it is missing or invalid.

What must an application manage?

An access token management solution must ensure that tokens are securely stored per user session for delegated downstream API user tokens and updated after each UI authentication or refresh. The solution should be robust to handle token expiration, function seamlessly after restarts, and support multi-instance deployments. The tokens must be persisted safely in multiple instance setups. Additionally, it must effectively manage scenarios involving invalid or missing access tokens.

Properties of token management in the solution setup:

The access token is persisted per user session

The token expires

The token needs to be persisted somewhere safely (safe and encrypted storage if not in-memory)

The token must be replaced after each UI authentication (per user)

The solution must work after restarts

The solution must work for multiple instances when deployed to multi-instance deployments

The solution must handle invalid access tokens or missing access tokens

The application must handle a user logout

Client Implementation (Entra ID API)

An OAuth 2.0 Token Exchange token request is sent to the Duende IdentityServer using the ApiTokenCacheClient. The service persists the token in a cache per user. The cache is implemented using the IDistributedCache interface.

using IdentityModel.Client;
using IdentityModel;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Options;
using System.Text.Json;

namespace WebApiEntraId.WebApiDuende;

public class ApiTokenCacheClient
{
    private readonly ILogger<ApiTokenCacheClient> _logger;
    private readonly IHttpClientFactory _httpClientFactory;
    private readonly IOptions<WebApiDuendeConfig> _webApiDuendeConfig;

    private static readonly object _lock = new();
    private readonly IDistributedCache _cache;
    private const int cacheExpirationInDays = 1;

    private class AccessTokenItem
    {
        public string AccessToken { get; set; } = string.Empty;
        public DateTime ExpiresIn { get; set; }
    }

    public ApiTokenCacheClient(
        IOptions<WebApiDuendeConfig> webApiDuendeConfig,
        IHttpClientFactory httpClientFactory,
        ILoggerFactory loggerFactory,
        IDistributedCache cache)
    {
        _webApiDuendeConfig = webApiDuendeConfig;
        _httpClientFactory = httpClientFactory;
        _logger = loggerFactory.CreateLogger<ApiTokenCacheClient>();
        _cache = cache;
    }

    public async Task<string> GetApiTokenOauthGrantTokenExchange(
        string clientId, string audience, string scope,
        string clientSecret, string aadAccessToken)
    {
        var accessToken = GetFromCache(clientId);
        if (accessToken != null)
        {
            if (accessToken.ExpiresIn > DateTime.UtcNow)
            {
                return accessToken.AccessToken;
            }
        }

        _logger.LogDebug("GetApiToken new from STS for {api_name}", clientId);

        // add
        var newAccessToken = await GetApiTokenOauthGrantTokenExchangeAad(
            clientId, audience, scope, clientSecret, aadAccessToken);
        AddToCache(clientId, newAccessToken);

        return newAccessToken.AccessToken;
    }

    private async Task<AccessTokenItem> GetApiTokenOauthGrantTokenExchangeAad(string clientId,
        string audience, string scope, string clientSecret, string entraIdAccessToken)
    {
        var tokenExchangeHttpClient = _httpClientFactory.CreateClient();
        tokenExchangeHttpClient.BaseAddress = new Uri(_webApiDuendeConfig.Value.IdentityProviderUrl);

        var cache = new DiscoveryCache(_webApiDuendeConfig.Value.IdentityProviderUrl);
        var disco = await cache.GetAsync();

        var tokenExchangeSuccessResponse = await tokenExchangeHttpClient
            .RequestTokenExchangeTokenAsync(new TokenExchangeTokenRequest
            {
                Address = disco.TokenEndpoint,
                ClientId = clientId,
                ClientSecret = clientSecret,
                Audience = audience,
                SubjectToken = entraIdAccessToken,
                SubjectTokenType = OidcConstants.TokenTypeIdentifiers.AccessToken,
                Scope = scope,
                Parameters =
                {
                    { "exchange_style", "delegation" }
                }
            });

        if (tokenExchangeSuccessResponse != null)
        {
            return new AccessTokenItem
            {
                ExpiresIn = DateTime.UtcNow.AddSeconds(tokenExchangeSuccessResponse.ExpiresIn),
                AccessToken = tokenExchangeSuccessResponse.AccessToken!
            };
        }

        _logger.LogError("no success response from oauth token exchange access token request");
        throw new ApplicationException("no success response from oauth token exchange access token request");
    }

    private void AddToCache(string key, AccessTokenItem accessTokenItem)
    {
        var options = new DistributedCacheEntryOptions()
            .SetSlidingExpiration(TimeSpan.FromDays(cacheExpirationInDays));

        lock (_lock)
        {
            _cache.SetString(key, JsonSerializer.Serialize(accessTokenItem), options);
        }
    }

    private AccessTokenItem? GetFromCache(string key)
    {
        var item = _cache.GetString(key);
        if (item != null)
        {
            return JsonSerializer.Deserialize<AccessTokenItem>(item);
        }

        return null;
    }
}

The WebApiDuendeService class uses the token API service to request data from the downstream API.

using IdentityModel.Client;
using Microsoft.Extensions.Options;
using System.Text.Json;

namespace WebApiEntraId.WebApiDuende;

public class WebApiDuendeService
{
    private readonly IOptions<WebApiDuendeConfig> _webApiDuendeConfig;
    private readonly IHttpClientFactory _clientFactory;
    private readonly ApiTokenCacheClient _apiTokenClient;

    public WebApiDuendeService(
        IOptions<WebApiDuendeConfig> webApiDuendeConfig,
        IHttpClientFactory clientFactory,
        ApiTokenCacheClient apiTokenClient)
    {
        _webApiDuendeConfig = webApiDuendeConfig;
        _clientFactory = clientFactory;
        _apiTokenClient = apiTokenClient;
    }

    public async Task<string> GetWebApiDuendeDataAsync(string entraIdAccessToken)
    {
        try
        {
            var client = _clientFactory.CreateClient();
            client.BaseAddress = new Uri(_webApiDuendeConfig.Value.ApiBaseAddress);

            var accessToken = await _apiTokenClient.GetApiTokenOauthGrantTokenExchange
            (
                _webApiDuendeConfig.Value.ClientId,
                _webApiDuendeConfig.Value.Audience,
                _webApiDuendeConfig.Value.ScopeForAccessToken,
                _webApiDuendeConfig.Value.ClientSecret,
                entraIdAccessToken
            );

            client.SetBearerToken(accessToken);

            var response = await client.GetAsync("api/profiles/photo");
            if (response.IsSuccessStatusCode)
            {
                var data = await response.Content.ReadAsStringAsync();
                if (data != null)
                {
                    return data;
                }

                return string.Empty;
            }

            throw new ApplicationException($"Status code: {response.StatusCode}, Error: {response.ReasonPhrase}");
        }
        catch (Exception e)
        {
            throw new ApplicationException($"Exception {e}");
        }
    }
}
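For orientation, here is a minimal sketch of how these services could be wired up in the Entra ID API's Program.cs. This is not taken from the original repository: the configuration section name and the service lifetimes are assumptions, and the in-memory distributed cache would be swapped for a persistent cache (Redis, SQL Server, and so on) in multi-instance deployments.

using WebApiEntraId.WebApiDuende;

var builder = WebApplication.CreateBuilder(args);

// Bind the downstream API settings; the section name "WebApiDuendeConfig" is an assumption.
builder.Services.Configure<WebApiDuendeConfig>(
    builder.Configuration.GetSection("WebApiDuendeConfig"));

builder.Services.AddHttpClient();

// IDistributedCache backing store; use a persistent cache for multi-instance deployments.
builder.Services.AddDistributedMemoryCache();

builder.Services.AddSingleton<ApiTokenCacheClient>();
builder.Services.AddScoped<WebApiDuendeService>();

var app = builder.Build();
app.Run();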

Duende IdentityServer implementation

Duende IdentityServer provides an IExtensionGrantValidator interface to implement identity server support for the OAuth 2.0 Token Exchange standard. This service must validate the access token and provide the data required to issue a new Duende access token. Other validation checks are required, like validating the sub claim, which represents the user in the delegated access token. It is important to validate the access token fully. New access tokens should only be issued for the same user. It is important to use a unique identifier from the access token to read data and issue new data for the user. An email is normally not a good solution for this, as users can change their email in some IAM solutions.

public class TokenExchangeGrantValidator : IExtensionGrantValidator
{
    private readonly ITokenValidator _validator;
    private readonly OauthTokenExchangeConfiguration _oauthTokenExchangeConfiguration;
    private readonly UserManager<ApplicationUser> _userManager;

    public TokenExchangeGrantValidator(ITokenValidator validator,
        IOptions<OauthTokenExchangeConfiguration> oauthTokenExchangeConfiguration,
        UserManager<ApplicationUser> userManager)
    {
        _validator = validator;
        _oauthTokenExchangeConfiguration = oauthTokenExchangeConfiguration.Value;
        _userManager = userManager;
    }

    public async Task ValidateAsync(ExtensionGrantValidationContext context)
    {
        // defaults
        context.Result = new GrantValidationResult(TokenRequestErrors.InvalidRequest);
        var customResponse = new Dictionary<string, object>
        {
            { OidcConstants.TokenResponse.IssuedTokenType, OidcConstants.TokenTypeIdentifiers.AccessToken }
        };

        var subjectToken = context.Request.Raw.Get(OidcConstants.TokenRequest.SubjectToken);
        var subjectTokenType = context.Request.Raw.Get(OidcConstants.TokenRequest.SubjectTokenType);
        var oauthTokenExchangePayload = new OauthTokenExchangePayload
        {
            subject_token = subjectToken!,
            subject_token_type = subjectTokenType!,
            audience = context.Request.Raw.Get(OidcConstants.TokenRequest.Audience),
            grant_type = context.Request.Raw.Get(OidcConstants.TokenRequest.GrantType)!,
            scope = context.Request.Raw.Get(OidcConstants.TokenRequest.Scope),
        };

        // mandatory parameters
        if (string.IsNullOrWhiteSpace(subjectToken))
        {
            return;
        }

        if (!string.Equals(subjectTokenType, OidcConstants.TokenTypeIdentifiers.AccessToken))
        {
            return;
        }

        var (Valid, Reason, Error) = ValidateOauthTokenExchangeRequestPayload
            .IsValid(oauthTokenExchangePayload, _oauthTokenExchangeConfiguration);

        if (!Valid)
        {
            return; // UnauthorizedValidationParametersFailed(oauthTokenExchangePayload, Reason, Error);
        }

        // get well known endpoints and validate access token sent in the assertion
        var configurationManager = new ConfigurationManager<OpenIdConnectConfiguration>(
            _oauthTokenExchangeConfiguration.AccessTokenMetadataAddress,
            new OpenIdConnectConfigurationRetriever());

        var wellKnownEndpoints = await configurationManager.GetConfigurationAsync();
        var accessTokenValidationResult = await ValidateOauthTokenExchangeRequestPayload.ValidateTokenAndSignature(
            subjectToken,
            _oauthTokenExchangeConfiguration,
            wellKnownEndpoints.SigningKeys);

        if (!accessTokenValidationResult.Valid)
        {
            return; // UnauthorizedValidationTokenAndSignatureFailed(oauthTokenExchangePayload, accessTokenValidationResult);
        }

        // get claims from Microsoft Entra ID token and re use in Duende IDP token
        var claimsIdentity = accessTokenValidationResult.ClaimsIdentity;
        if (claimsIdentity == null)
        {
            return;
        }

        var isDelegatedToken = ValidateOauthTokenExchangeRequestPayload
            .IsDelegatedAadAccessToken(claimsIdentity);

        if (!isDelegatedToken)
        {
            return; // UnauthorizedValidationRequireDelegatedTokenFailed();
        }

        var name = ValidateOauthTokenExchangeRequestPayload.GetPreferredUserName(claimsIdentity);
        var isNameAndEmail = ValidateOauthTokenExchangeRequestPayload.IsEmailValid(name);
        if (!isNameAndEmail)
        {
            return; // UnauthorizedValidationPreferredUserNameFailed();
        }

        // Should use the OID
        var user = await _userManager.FindByNameAsync(name);
        if (user == null)
        {
            return; // UnauthorizedValidationNoUserExistsFailed();
        }

        var sub = claimsIdentity.Claims!.First(c => c.Type == JwtClaimTypes.Subject).Value;
        var style = context.Request.Raw.Get("exchange_style");

        if (style == "impersonation")
        {
            // set token client_id to original id
            context.Request.ClientId = oauthTokenExchangePayload.audience!;

            context.Result = new GrantValidationResult(
                subject: sub,
                authenticationMethod: GrantType,
                customResponse: customResponse);
        }
        else if (style == "delegation")
        {
            // set token client_id to original id
            context.Request.ClientId = oauthTokenExchangePayload.audience!;

            var actor = new
            {
                client_id = context.Request.Client.ClientId
            };

            var actClaim = new Claim(JwtClaimTypes.Actor,
                JsonSerializer.Serialize(actor),
                IdentityServerConstants.ClaimValueTypes.Json);

            context.Result = new GrantValidationResult(
                subject: sub,
                authenticationMethod: GrantType,
                claims: [actClaim],
                customResponse: customResponse);
        }
        else if (style == "custom")
        {
            context.Result = new GrantValidationResult(
                subject: sub,
                authenticationMethod: GrantType,
                customResponse: customResponse);
        }
    }

    public string GrantType => OidcConstants.GrantTypes.TokenExchange;
}
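As the "Should use the OID" comment above hints, the user lookup would ideally be keyed on the Entra ID object id rather than the preferred user name. The following fragment is a rough sketch of that lookup, assuming the oid has been stored as a user claim; it is not part of the original post and would replace the FindByNameAsync call inside ValidateAsync.

// Hypothetical: resolve the local account via the Entra ID object id ("oid") claim,
// assuming the oid was stored as a user claim, for example at first sign-in.
var oid = claimsIdentity.Claims.FirstOrDefault(c => c.Type == "oid")?.Value;
if (oid == null)
{
    return;
}

var usersWithOid = await _userManager.GetUsersForClaimAsync(new Claim("oid", oid));
var user = usersWithOid.SingleOrDefault();
if (user == null)
{
    return;
}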

In Duende a client is required to support the OAuth 2.0 Token Exchange. This is added using the AllowedGrantTypes property. A secret is also required to acquire a new access token.

new Client
{
    ClientId = "tokenexchangeclientid",
    ClientSecrets = { new Secret("--in-user-secrets--".Sha256()) },
    AllowedGrantTypes = { OidcConstants.GrantTypes.TokenExchange },
    AllowedScopes = { "shopclientscope" }
}

Support for the OAuth Token Exchange is added to the Duende IdentityServer setup using the AddExtensionGrantValidator extension method.

var idsvrBuilder = builder.Services
    .AddIdentityServer(options =>
    {
        options.Events.RaiseErrorEvents = true;
        options.Events.RaiseInformationEvents = true;
        options.Events.RaiseFailureEvents = true;
        options.Events.RaiseSuccessEvents = true;

        // see https://docs.duendesoftware.com/identityserver/v6/fundamentals/resources/
        options.EmitStaticAudienceClaim = true;
    })
    .AddInMemoryIdentityResources(Config.IdentityResources)
    .AddInMemoryApiScopes(Config.ApiScopes)
    .AddInMemoryClients(Config.Clients())
    .AddAspNetIdentity<ApplicationUser>();

// registers extension grant validator for the token exchange grant type
idsvrBuilder.AddExtensionGrantValidator<TokenExchangeGrantValidator>();

Running the solutions

When all four applications are started, the data from the Duende protected API is returned to the Razor Page application which uses Microsoft Entra ID to authenticate.

Links
Implement the OAUTH 2.0 Token Exchange delegated flow between an Microsoft Entra ID API and an API protected using OpenIddict

https://github.com/damienbod/OAuthGrantExchangeOidcDownstreamApi

https://docs.duendesoftware.com/identityserver/v7/tokens/extension_grants/token_exchange/

JSON Web Token (JWT)

Best Current Practice for OAuth 2.0 Security

The OAuth 2.0 Authorization Framework

OAuth 2.0 Demonstrating Proof of Possession DPoP

OAuth 2.0 JWT-Secured Authorization Request (JAR) RFC 9101

OAuth 2.0 Mutual-TLS Client Authentication and Certificate-Bound Access Tokens

OpenID Connect 1.0

Microsoft identity platform and OAuth 2.0 On-Behalf-Of flow

OAuth 2.0 Token Exchange

JSON Web Token (JWT) Profile for OAuth 2.0 Access Tokens

HTTP Semantics RFC 9110

https://learn.microsoft.com/en-us/aspnet/core/security/authentication/social/additional-claims

https://github.com/dotnet/aspnetcore/issues/8175


Simon Willison

Cerebras brings instant inference to Mistral Le Chat

Cerebras brings instant inference to Mistral Le Chat Mistral announced a major upgrade to their Le Chat web UI (their version of ChatGPT) a few days ago, and one of the signature features was performance. It turns out that performance boost comes from hosting their model on Cerebras: We are excited to bring our technology to Mistral – specifically the flagship 123B parameter Mistral Large 2

Cerebras brings instant inference to Mistral Le Chat

Mistral announced a major upgrade to their Le Chat web UI (their version of ChatGPT) a few days ago, and one of the signature features was performance.

It turns out that performance boost comes from hosting their model on Cerebras:

We are excited to bring our technology to Mistral – specifically the flagship 123B parameter Mistral Large 2 model. Using our Wafer Scale Engine technology, we achieve over 1,100 tokens per second on text queries.

Given Cerebras's so far unrivaled inference performance I'm surprised that no other AI lab has formed a partnership like this already.

Tags: mistral, generative-ai, cerebras, ai, llms

Sunday, 09. February 2025

Doc Searls Weblog

Fire and Rain

Twenty-fifth in the News Commons series Southern California has two seasons: Fire and Rain. Rain didn’t begin this year until a few days after Fire ended apocalyptically, incinerating much of Altadena and Pacific Palisades. Now Rain is here, with the occasional atmospheric river flowing across the faces of hills and mountains whose beards were just […]

Twenty-fifth in the News Commons series

This is etched in limestone over an entrance to a building at the University of Wyoming. It gave John McPhee the title for perhaps his most useful book.

Southern California has two seasons: Fire and Rain.

Rain didn’t begin this year until a few days after Fire ended apocalyptically, incinerating much of Altadena and Pacific Palisades. Now Rain is here, with the occasional atmospheric river flowing across the faces of hills and mountains whose beards were just burned off.

Much is being  written about what geologists call debris flows and Muggles call mudslides. These tend to happen when mountains of poorly bound slopes of loose rock and soil get drenched in rain after the vegetation holding everything in place has been burned away. Many of the big slopes above Los Angeles are mile-wide dump trucks waiting to hoist their beds and open their tailgates. The Los Angeles County Flood Control District (aka “Flood”) has been on this case for a century and stands ready with debris basins and other risk abatements. But the San Gabriel Mountains are much bigger, and resemble volcanoes in their tectonic youth and tendency to erode downward and outward almost as fast as earthquakes push them up.

For describing all this, no writing is more useful than John McPhee’s Los Angeles Against The Mountains, which ran first as a pair of long essays in September and October 1988 editions of The New Yorker, and later as a chapter in The Control of Nature, which came out in 1990. (Another chapter in that book is “The Army Corps of Engineers vs. the Mississippi,” which first appeared as Atchafalaya in a 1987 issue of The New Yorker. The magazine published it again after Hurricane Katrina as if it were a live report on what happened when the levees failed in New Orleans—though it had been written, prophetically, seventeen years earlier.)

McPhee is the Beethoven, the Rembrandt, the Shakespeare of nonfiction: inimitable and without equal. At ninety-three he is still writing and teaching other writers at Princeton. Though I’ve never set foot in Princeton, McPhee has been teaching me how to write for many decades.  I have read all his books, some many times. A few have so many page flags I could brush off a table with them. Some of those flags now aid my compression of  “Los Angeles Against the Mountains.”

Here he frames the battle:

…in the confrontation of the urban and the wild. Los Angeles is overmatched on one side by the Pacific Ocean and on the other by very high mountains. With respect to these principal boundaries, Los Angeles is done sprawling. The San Gabriels, in their state of tectonic youth, are rising as rapidly as any range on earth. Their loose inimical slopes flout the tolerance of the angle of repose. Rising straight up out of the megalopolis, they stand ten thousand feet above the nearby sea, and they are not kidding with this city. Shedding, spalling, self-destructing, they are disintegrating at a rate that is also among the fastest in the world…

The slopes average sixty-five to seventy per cent. In numerous places, they are vertical…The term “oversteepened” is often used to describe them. At the giddy extreme of oversteepening is the angle of maximum slope. Very large sections of the San Gabriels closely approach that angle. In such terrain, there is not much to hold the loose material except the plants that grow there…

And here he’s describing the experience of hanging with geologists on the steep face of one slope:

Evergreen oaks were fingering up the creases in the mountainsides, pointing toward the ridgeline forests of big-cone Douglas fir, of knobcone and Coulter pine. The forests had an odd sort of timberline. They went down to it rather than up. Down from the ridges the conifers descended through nine thousand, seven thousand, six thousand feet, stopping roughly at five. The forests abruptly ended—the country below being too dry in summer to sustain tall trees. On down the slopes and all the way to the canyons was a thicket of varied shrubs that changed in character as altitude fell but was everywhere dense enough to stop an army. On its lower levels, it was all green, white, and yellow with buckwheat, burroweed, lotus and sage, deerweed, bindweed, yerba santa… The thicket’s resistance to trespass, while everywhere formidable, stiffened considerably as it evolved upward…Whatever and wherever they were, these plants were prickly, thick, and dry, and a good deal tougher than tundra. Those evergreen oaks fingering up the creases in the mountains were known to the Spaniards as chaparros. Riders who worked in the related landscape wore leather overalls open at the back, and called them chaparajos. By extension, this all but impenetrable brush was known as chaparral.

The low stuff, at the buckwheat level, is often called soft chaparral. Up in the tough chamise, closer to the lofty timber, is high chaparral, which is also called hard chaparral. High or low—hard, soft, or mixed—all chaparral has in common an always developing, relentlessly intensifying, vital necessity to burst into flame. In a sense, chaparral consumes fire no less than fire consumes chaparral. Fire nourishes and rejuvenates the plants. There are seeds that fall into the soil, stay there indefinitely, and will not germinate except in the aftermath of fire. There are basal buds that sprout only after fire. Droughts are so long, rains so brief, that dead bits of wood and leaves scarcely decay. Instead, they accumulate, thicken, until the plant community is all but strangling in its own duff. The nutrients in the dead material are being withheld from the soil. When fire comes, it puts the nutrients back in the ground. It clears the terrain for fresh growth. When chaparral has not been burned for thirty years, about half the thicket will be dry dead stuff—twenty-five thousand tons of it in one square mile. The living plants are no less flammable. The chamise, the manzanita—in fact, most chaparral plants—are full of solvent extractives that burn intensely and ignite easily. Their leaves are glossy with oils and resins that seal in moisture during hot dry periods and serve the dual purpose of responding explosively to flame. In the long dry season, and particularly in the fall, air flows southwest toward Los Angeles from the Colorado Plateau and the Basin and Range. Extremely low in moisture, it comes out of the canyon lands and crosses the Mojave Desert. As it drops in altitude, it compresses, becoming even dryer and hotter. It advances in gusts. This is the wind that is sometimes called the foehn. The fire wind. The devil wind. In Los Angeles, it is known as Santa Ana. When chamise and other chaparral plants sense the presence of Santa Ana winds, their level of moisture drops, and they become even more flammable than they were before. The Santa Anas bring what has been described as “instant critical fire weather.” Temperatures rise above a hundred degrees. Humidity drops very close to zero. According to Charles Colver, of the United States Forest Service, “moisture evaporates off your eyeballs so fast you have to keep blinking.”

About fires, when they happen, he says this:

…Where the Santa Anas collide with local mountain winds, they become so erratic that they can scatter a fire in big flying brands for a long distance in any direction. The frequency and the intensity of the forest fires in the Southern California chaparral are the greatest in the United States, with the possible exception of the wildfires of the New Jersey Pine Barrens.

McPhee’s writing is always wrapped around characters, most of whom I’m leaving out of this compression. The first exception for our purpose here is Wade Wells, a hydrologist with the United States Forest Service.

The chaparral fires are considerably more potent than the forest fires Wade Wells saw when he was an undergraduate at the University of Idaho or when he worked as a firefighter in the Pacific Northwest. “Fires in the Pacific Northwest are nothing compared with these chaparral fires,” he remarked. “Chaparral fires are almost vicious by comparison. They’re so intense. Chaparral is one of the most flammable vegetation complexes there are.”

It burns as if it were soaked with gasoline. Chaparral plants typically have multiple stems emerging from a single root crown, and this contributes not only to the density of the thickets but, ultimately, to the surface area of combustible material that stands prepared for flame. Hundreds of acres can be burned clean in minutes. In thick black smoke there is wild orange flame, rising through the canyons like explosion crowns. The canyons serve as chimneys, and in minutes whole mountains are aflame, resembling volcanoes, emitting high columns of fire and smoke…

In millennia before Los Angeles settled its plain, the chaparral burned every thirty years or so, as the chaparral does now. The burns of prehistory, in their natural mosaic, were smaller than the ones today. With cleared fire lanes, chemical retardants, and other means of suppressing what is not beyond control, people have conserved fuel in large acreages. When the inevitable fires come, they burn hotter, higher, faster than they ever did in a state of unhindered nature. When the fires end, there is nothing much left on the mountainsides but a thin blanket of ash. The burns are vast and bare. On the sheer declivities where the surface soils were held by chaparral, there is no chaparral.

Fine material tumbles downslope and collects in the waterless beds of streams. It forms large and bulky cones there, to some extent filling the canyons. Under green chaparral, the gravitational movement of bits of soil, particles of sand, and other loose debris goes on month after month, year after year, especially in oversteepened environments, where it can represent more than half of all erosion. After a burn, though, it increases exponentially. It may increase twentyfold, fortyfold, even sixtyfold. This steady tumbling descent of unconsolidated mountain crumbs is known as dry ravel. After a burn, so much dry ravel and other debris becomes piled up and ready to go that to live under one of those canyons is (as many have said) to look up the barrel of a gun.

One would imagine that the first rain would set the whole thing off, but it doesn’t. The early-winter rains—and sometimes the rains of a whole season—are not enough to make the great bulk move. Actually, they add to it…

In the slow progression of normal decay, chaparral litter seems to give up to the soil what have been vaguely described as “waxlike complexes of long-chain aliphatic hydrocarbons.” These waxy substances are what make unburned chaparral soil somewhat resistant to water, or “slightly nonwettable,” as Wells and his colleagues are wont to describe it. When the wildfires burn, and temperatures at the surface of the ground are six or seven hundred centigrade degrees, the soil is so effective as an insulator that the temperature one centimetre below the surface may not be hot enough to boil water. The heavy waxlike substances vaporize at the surface and recondense in the cooler temperatures below. Acting like oil, they coat soil particles and establish the hydrophobic layer—one to six centimetres down. Above that layer, where the waxlike substances are gone, the veneer of burned soil is “wettable.” When Wells drips water on a dishful of that, the water soaks in as if the dish were full of Kleenex. When rain falls on burned and denuded ground, it soaks the very thin upper layer but can penetrate no farther. Hiking boots strike hard enough to break through into the dust, but the rain is repelled and goes down the slope. Of all the assembling factors that eventually send debris flows rumbling down the canyons, none is more detonative than the waterproof soil.

In the first rains after a fire, water quickly saturates the thin permeable layer, and liquefied soil drips downhill like runs of excess paint. These miniature debris flows stripe the mountainsides with miniature streambeds—countless scarlike rills that are soon the predominant characteristic of the burned terrain. As more rain comes, each rill is going to deliver a little more debris to the accumulating load in the canyon below. But, more to the point, each rill—its natural levees framing its impermeable bed—will increase the speed of the surface water. As rain sheds off a mountainside like water off a tin roof, the rill network, as it is called, may actually cube the speed, and therefore the power, of the runoff. The transport capacity of the watershed—how much bulk it can move—may increase a thousandfold. The rill network is prepared to deliver water with enough force and volume to mobilize the deposits lying in the canyons below. With the appearance of the rills, almost all prerequisites have now sequentially occurred. The muzzle-loader is charged. For a full-scale flat-out debris flow to burst forth from the mountains, the final requirement is a special-intensity storm.

Some of the most concentrated rainfall in the history of the United States has occurred in the San Gabriel Mountains. The oddity of this is about as intense as the rain. Months—seasons—go by in Los Angeles without a fallen drop. Los Angeles is one of the least-rained-upon places in the Western Hemisphere. The mountains are so dry they hum. Erosion by dry ravel greatly exceeds erosion by water. The celebrated Mediterranean climate of Los Angeles owes itself to aridity. While Seattle is receiving its average rainfall of thirty-nine inches a year, Chicago thirty-three, the District of Columbia thirty-nine, and New York City forty-four, Los Angeles is doing well if it gets fifteen. In one year out of every four over the past century, rainfall in Los Angeles has been under ten inches, and once or twice it was around five. That is pure Gobi. When certain storm systems approach Los Angeles, though—storms that come in on a very long reach from far out in the Pacific—they will pick up huge quantities of water from the ocean and just pump it into the mountains…

Here begins a story about one debris flow, in a setting high and dry:

Five miles into the mountains from the edge of the city is a small, obscure, steep-sided watershed of twenty-five hundred acres which is drained by the Middle Fork of Mill Creek, a tributary of the Big Tujunga. The place is so still you can hear the dry ravel…

Then came Fire:

It was a textbook situation—a bowl in the mountains filled with hard chaparral that had not been touched by fire in ninety-nine years. The older chaparral becomes, the hotter it burns. In its first ten years of new growth, it is all but incombustible. After twenty years, its renewed flammability curves sharply upward. It burns, usually, before it is forty years old. The hotter the fire, the more likely a debris flow—and the greater the volume when it comes. The century-old fuel of the Middle Fork was so combustible that afterward there were not even stumps. The slopes looked sandpapered. The streambed, already loaded, piled even higher with dry ravel. The Middle Fire, as the burn was known, was cause for particular alarm, because a small settlement was a mile downstream. Its name—Hidden Springs—contained more prophecy than its residents seemed prepared to imagine. Three hundred and ninety thousand cubic yards of loose debris was gathered just above them, awaiting mobilization.

Then, Rain:

Toward midnight February 9th, an accidental fire broke out in a small building of Gabe Hinterberg’s. A fire truck eventually came. Half a dozen people fought the fire, assisted by the heavy rain. One of them was George Scribner. The five-minute spike of greatest downpour occurred at about one-thirty. Half an hour later, George said, “Hey, we got the fire put out.”

Gabe said, “Good deal.”

And then Gabe and George were dead.

Amos Lewis, nearby, was holding a fire hose in his hand and was attempting to prevent it from kinking. In his concentration, he did not see danger coming. He heard nothing ominous. He only felt the hose draw taut. Through his peripheral vision he became aware that the fire truck—with the hose connected to it—was somehow moving sideways. Seconds later, Amos Lewis, too, was swept away.

The snout of the debris flow was twenty feet high, tapering behind. Debris flows sometimes ooze along, and sometimes move as fast as the fastest river rapids. The huge dark snout was moving nearly five hundred feet a minute and the rest of the flow behind was coming twice as fast, making roll waves as it piled forward against itself—this great slug, as geologists would describe it, this discrete slug, this heaving violence of wet cement. Already included in the debris were propane tanks, outbuildings, picnic tables, canyon live oaks, alders, sycamores, cottonwoods, a Lincoln Continental, an Oldsmobile, and countless boulders five feet thick. All this was spread wide a couple of hundred feet, and as the debris flow went through Hidden Springs it tore out more trees, picked up house trailers and more cars and more boulders, and knocked Gabe Hinterberg’s lodge completely off its foundation. Mary and Cal Drake were standing in their living room when a wall came off. “We got outside somehow,” he said later. “I just got away. She was trying to follow me. Evidently, her feet slipped out from under her. She slid right down into the main channel.” The family next door were picked up and pushed against their own ceiling. Two were carried away. Whole houses were torn loose with people inside them. A house was ripped in half. A bridge was obliterated. A large part of town was carried a mile downstream and buried in the reservoir behind Big Tujunga Dam. Thirteen people were part of the debris. Most of the bodies were never found.

As Amos Lewis suddenly found himself struggling in the viscous flow, he more or less bumped into a whirling pickup, coming down in the debris from who knows where upstream. One of the roll waves picked him up and threw him into the back of the truck. As the vehicle spun around and around, it neared one bank. Lewis saw an overhanging limb. He reached for it, caught it, and pulled himself above the rocky flow. Years later, just about where this had happened, he told Wade Wells and me the story. “I got pushed to one side,” he said as he finished. “I lucked out.” Lewis is a prematurely white-haired man with a white beard and dark-brown eyes. On this day in late spring, his muscular build and deeply tanned skin were amply displayed by a general absence of clothing. He wore bluejean shorts, white socks, mountain boots, and nothing else. When people began to discover human remains in the reservoir, he had gone in his patrol car to investigate the fate of his neighbors. “I had to go roll on them calls,” he said. “A deputy sheriff has to roll on any type of body being found. I carried out at least four, maybe five, skulls.”

These are all things to bear in mind when we visit how news ought to work in a digital environment where facts mostly don’t matter—until they do.



Simon Willison

Quoting Sam Altman


The cost to use a given level of AI falls about 10x every 12 months, and lower prices lead to much more use. You can see this in the token cost from GPT-4 in early 2023 to GPT-4o in mid-2024, where the price per token dropped about 150x in that time period. Moore’s law changed the world at 2x every 18 months; this is unbelievably stronger.

Sam Altman, Three Observations

Tags: generative-ai, openai, llm-pricing, ai, llms, sam-altman


Ben Werdmüller

USAID’s reproductive health spending has saved millions of lives. Now it’s gone.


[Jessica Kutz at The 19th]

USAID's defunding will lead directly to women's deaths:

"As of 2023, 67 percent of contraceptives supplied through USAID went to Africa, where some of the leading causes of death for girls and women are related to pregnancy and sexually transmitted infections like HIV. According to an analysis by the Guttmacher Institute, if no contraceptive care is provided by USAID in 2025, that will lead to about 4.2 million unintended pregnancies and over 8,000 deaths related to pregnancy and childbirth complications."

The article goes on to detail efforts in countries like Afghanistan, Senegal, India, and Nigeria. The idea that we should simply rug-pull these efforts is ludicrous: it sends a clear message that we no longer care about the well-being of people overseas, and that we don't think their quality of life is important to us or affects us. This is an obvious, profound mistake.

#Democracy

[Link]


Donald Trump’s Immigration Executive Orders: Tracking the Most Impactful Changes


[Mica Rosenberg, Perla Trevizo, and Zisiga Mukulu in ProPublica, co-published with The Texas Tribune]

This is a beautifully-designed co-production between ProPublica and The Texas Tribune, illustrating the immigration policies that Donald Trump enacted on day one. These encompassed dozens of policies that were revived from his first term, as well as seven new ones that hadn't been tried before.

"In order to provide a glimpse of the enormity of the changes that are underway, ProPublica and the Tribune identified nearly three dozen of the most impactful policy changes set in motion by the orders signed on the first day. Most were pulled from the playbook of Trump’s previous presidency. Others are unprecedented."

The new ones are pretty stark, including:

"Ending and clawing back funding from organizations that support migrants: Seeks to stop or limit money to nongovernmental organizations that provide shelter and services to migrants released at the border, as well as legal orientation programs for people in immigration proceedings."

And, of course much has been written about the unconstitutionality of:

"Seeks to end birthright citizenship: Attempts to end birthright citizenship of children born to parents either illegally in the United States or under a temporary legal status, something Trump had only said he wanted to do in his first term."

It's useful to have these written in one place, in an easy-to-digest form, together with updates on what's happened since. The news can feel like a deluge, and aggregating the updates into something parseable is important.

#Democracy

[Link]

Saturday, 08. February 2025

Ben Werdmüller

Shattering the Overton Window


[Natalia Antelava in Coda Story]

This is a useful framework for thinking about ongoing harm.

"It was 2014, and I was standing in the ruins of Donetsk airport, when a Russian-backed rebel commander launched into what seemed like an oddly academic lecture. Between bursts of artillery fire, he explained an American political science concept: the Overton Window – a theory that describes the range of policies and ideas a society considers acceptable at any given time. Politicians can’t successfully propose anything outside this “window” of acceptability without risking their careers. “The West uses this window,” he said, smoke from his cigarette blowing into my face, “to destroy our traditional values by telling us it’s okay for me to marry a man and for you to marry a woman. But we won’t let them.”"

And that's the real, lasting impact of Trump and his worldview:

"As transactional relationships replace values-based alliances, as oligarchic control displaces democratic institutions, as the unthinkable becomes routine – the transformation of our societies isn’t happening by accident."

What will undoing this take? How can we shift the Overton Window back towards inclusion, communities, and compassion? How can we get to the mutualistic, integrated society we need to reach, and say goodbye to this disgustingly retrograde conservatism for good?

#Democracy

[Link]


Simon Willison

Quoting Salvatore Sanfilippo


[...] We are destroying software with complex build systems.

We are destroying software with an absurd chain of dependencies, making everything bloated and fragile.

We are destroying software telling new programmers: “Don’t reinvent the wheel!”. But, reinventing the wheel is how you learn how things work, and is the first step to make new, different wheels. [...]

Salvatore Sanfilippo, We are destroying software

Tags: salvatore-sanfilippo, programming, software-engineering


Wrench in the Gears

Proposal For A Spring Break From Mind Viruses – March 13 to the Summer Solstice: Ground, Gather Energy, Reflect, Renew 5 Years Later


I made a video today reflecting on how we manage (or don’t) our energy and how our engagement with digital information complicates our lives. I’m proposing that people consider dialing back their social media consumption to spend time reflecting on their thinking, what we have learned over the past five years, and how we can navigate future challenges with integrity and grace. In the space that is created I am hoping some people will experience ontopoetic synchronicities, conversations with the divine, as we become more intentional about how we direct our consciousness and bodies in this 3D world. I would love to hear back from anyone who gives this a try to see what your experience has been. Life is a stage and we are players. Consider this an unplugged improv opportunity where you get to riff with the quantum.

Words of the day: biosemiotics, memetics, reality tunnel, and metacognition.

Take care with that biocomputer programming. : )

Friday, 07. February 2025

Ben Werdmüller

From COBOL to chaos: Elon Musk, DOGE, and the Evil Housekeeper Problem


[Dan Hon in MIT Technology Review]

The always-brilliant Dan Hon on DOGE:

"We’re seeing in real time that there are no practical technical measures preventing someone from taking a spanner to the technology that keeps our government stable, that keeps society running every day—despite the very real consequences.

So we should plan for the worst, even if the likelihood of the worst is low."

The suggestions that follow - identifying risks, working together, standing up and saying "no" - are all sensible and needed.

#Democracy

[Link]


'Reboot' Revealed: Elon Musk's CEO-Dictator Playbook


[Gil Duran in The Nerd Reich]

Curtis Yarvin's influence is felt again:

" In an essay on his paywalled Substack, he imagined a second Trump presidency in which Trump would enable a radical government transformation. The proposal will sound familiar to anyone who has watched Musk wreak havoc on the United States Government (USG) over the past three weeks."

As Duran points out, none of what's happening right now is exactly new or a surprise:

"What surprises me most is how the political press generally fails to inform the public that Musk is taking a systematic approach, one that has been outlined in public forums for years. (Some press outlets, like the Washington Post and Los Angeles Times, are owned by billionaires keenly interested in kowtowing to Musk and Trump.)"

For many people, the myth of American exceptionalism may be so deeply in their bloodstream that they simply can't imagine our institutions falling to this. But of course they can: this is the country that gave us McCarthyism and Jim Crow. It's happening in plain sight.

#Democracy

[Link]


@_Nat Zone

Personal Information Protection Commission presents a concrete direction for the institutional design of the "rules on the data subject's involvement"


At its February 5 meeting, the Personal Information Protection Commission presented a concrete direction for the institutional design of the "rules on the data subject's involvement" (how consent regulation should work, how individuals should be notified when a breach occurs, and children's personal information), the part of the January 22 materials to which the most new content had been added following the secretariat's hearings. https://www.ppc.go.jp/aboutus/minutes/2024/20250205/

I will read it through properly later, but as usual I had Otio produce a summary, which I am sharing here.

Detailed summary of the draft "Views on Institutional Issues in the Personal Information Protection Act"

1. Rules on the data subject's involvement in the handling of personal data
Reexamination of the cases in which the individual's consent is not required
Sharing of data for producing statistical information is on the rise
Use of statistical information that excludes specific individuals is seen as carrying a low risk of infringing rights
Where administrative agencies are involved in producing statistics, an expansion of the exception provisions is proposed

2. Handling that does not go against the individual's intent
Clarify the conditions under which consent is unnecessary
Where necessary for the performance of a contract
Where privacy and other rights and interests are not harmed
Reflect stakeholder opinions when setting the conditions and their concrete scope

3. Obtaining consent to protect life or public health
Consideration of cases in which obtaining consent is difficult
Proposal to allow the exception not only where obtaining the individual's consent is difficult but also where there are "reasonable grounds"
Premised on privacy protection measures being in place

4. Rules on handling children's personal information
Clarify the age threshold applied to children
Proposal to target those under 16 and provide special protection
Consent from a legal representative as the rule, while allowing the child's own consent in exceptional cases
Handling that takes the best interests of the minor into account
Proposal to place duty provisions on businesses
Need for amendments placing similar duties on administrative agencies


Simon Willison

Quoting Jared Palmer


Confession: we've been hiding parts of v0's responses from users since September. Since the launch of DeepSeek's web experience and its positive reception, we realize now that was a mistake. From now on, we're also showing v0's full output in every response. This is a much better UX because it feels faster and it teaches end users how to prompt more effectively.

Jared Palmer, VP of AI at Vercel

Tags: ux, prompt-engineering, vercel, deepseek, generative-ai, ai, llms


Using pip to install a Large Language Model that's under 100MB


I just released llm-smollm2, a new plugin for LLM that bundles a quantized copy of the SmolLM2-135M-Instruct LLM inside of the Python package.

This means you can now pip install a full LLM!

If you're already using LLM you can install it like this:

llm install llm-smollm2

Then run prompts like this:

llm -m SmolLM2 'Are dogs real?'

(New favourite test prompt for tiny models, courtesy of Tim Duffy. Here's the result).

If you don't have LLM yet first follow these installation instructions, or brew install llm or pipx install llm or uv tool install llm depending on your preferred way of getting your Python tools.

If you have uv setup you don't need to install anything at all! The following command will spin up an ephemeral environment, install the necessary packages and start a chat session with the model all in one go:

uvx --with llm-smollm2 llm chat -m SmolLM2
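
If you prefer the Python API over the CLI, the same model should be reachable through the llm library once the plugin is installed. A minimal sketch, assuming the plugin registers the model under the same "SmolLM2" ID used on the command line:

import llm

# Load the model registered by the llm-smollm2 plugin (assumed ID: "SmolLM2")
model = llm.get_model("SmolLM2")

# Run a prompt and print the generated text
response = model.prompt("Are dogs real?")
print(response.text())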

Finding a tiny model
Building the plugin
Packaging the plugin
Publishing to PyPI
Is the model any good?

Finding a tiny model

The fact that the model is almost exactly 100MB is no coincidence: that's the default size limit for a Python package that can be uploaded to the Python Package Index (PyPI).

I asked on Bluesky if anyone had seen a just-about-usable GGUF model that was under 100MB, and Artisan Loaf pointed me to SmolLM2-135M-Instruct.

I ended up using this quantization by QuantFactory just because it was the first sub-100MB model I tried that worked.

Trick for finding quantized models: Hugging Face has a neat "model tree" feature in the side panel of their model pages, which includes links to relevant quantized models. I find most of my GGUFs using that feature.

Building the plugin

I first tried the model out using Python and the llama-cpp-python library like this:

uv run --with llama-cpp-python python

Then:

from llama_cpp import Llama
from pprint import pprint

llm = Llama(model_path="SmolLM2-135M-Instruct.Q4_1.gguf")
output = llm.create_chat_completion(messages=[
    {"role": "user", "content": "Hi"}
])
pprint(output)

This gave me the output I was expecting:

{'choices': [{'finish_reason': 'stop',
              'index': 0,
              'logprobs': None,
              'message': {'content': 'Hello! How can I assist you today?',
                          'role': 'assistant'}}],
 'created': 1738903256,
 'id': 'chatcmpl-76ea1733-cc2f-46d4-9939-90efa2a05e7c',
 'model': 'SmolLM2-135M-Instruct.Q4_1.gguf',
 'object': 'chat.completion',
 'usage': {'completion_tokens': 9, 'prompt_tokens': 31, 'total_tokens': 40}}

But it also spammed my terminal with a huge volume of debugging output - which started like this:

llama_model_load_from_file_impl: using device Metal (Apple M2 Max) - 49151 MiB free
llama_model_loader: loaded meta data with 33 key-value pairs and 272 tensors from SmolLM2-135M-Instruct.Q4_1.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama

And then continued for more than 500 lines!

I've had this problem with llama-cpp-python and llama.cpp in the past, and was sad to find that the documentation still doesn't have a great answer for how to avoid this.

So I turned to the just released Gemini 2.0 Pro (Experimental), because I know it's a strong model with a long input limit.

I ran the entire llama-cpp-python codebase through it like this:

cd /tmp
git clone https://github.com/abetlen/llama-cpp-python
cd llama-cpp-python
files-to-prompt -e py . -c | llm -m gemini-2.0-pro-exp-02-05 \
  'How can I prevent this library from logging any information at all while it is running - no stderr or anything like that'

Here's the answer I got back. It recommended setting the logger to logging.CRITICAL, passing verbose=False to the constructor and, most importantly, using the following context manager to suppress all output:

import os
from contextlib import contextmanager, redirect_stderr, redirect_stdout

@contextmanager
def suppress_output():
    """
    Suppresses all stdout and stderr output within the context.
    """
    with open(os.devnull, "w") as devnull:
        with redirect_stdout(devnull), redirect_stderr(devnull):
            yield

This worked! It turned out most of the output came from initializing the LLM class, so I wrapped that like so:

with suppress_output():
    model = Llama(model_path=self.model_path, verbose=False)
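
For completeness, here is what the first two recommendations look like when combined with the context manager. This is a sketch based on the description above; the assumption that llama-cpp-python emits its Python-side messages under the "llama_cpp" logger name is mine, not something I have verified:

import logging
from llama_cpp import Llama

# Assumption: llama-cpp-python logs under the "llama_cpp" logger name
logging.getLogger("llama_cpp").setLevel(logging.CRITICAL)

# verbose=False plus the suppress_output() context manager defined above
# silences the remaining C-level output from model initialization
with suppress_output():
    model = Llama(model_path="SmolLM2-135M-Instruct.Q4_1.gguf", verbose=False)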

Proof of concept in hand I set about writing the plugin. I started with my simonw/llm-plugin cookiecutter template:

uvx cookiecutter gh:simonw/llm-plugin
  [1/6] plugin_name (): smollm2
  [2/6] description (): SmolLM2-135M-Instruct.Q4_1 for LLM
  [3/6] hyphenated (smollm2):
  [4/6] underscored (smollm2):
  [5/6] github_username (): simonw
  [6/6] author_name (): Simon Willison

The rest of the plugin was mostly borrowed from my existing llm-gguf plugin, updated based on the latest README for the llama-cpp-python project.

There's more information on building plugins in the tutorial on writing a plugin.

Packaging the plugin

Once I had that working the last step was to figure out how to package it for PyPI. I'm never quite sure of the best way to bundle a binary file in a Python package, especially one that uses a pyproject.toml file... so I dumped a copy of my existing pyproject.toml file into o3-mini-high and prompted:

Modify this to bundle a SmolLM2-135M-Instruct.Q4_1.gguf file inside the package. I don't want to use hatch or a manifest or anything, I just want to use setuptools.

Here's the shared transcript - it gave me exactly what I wanted. I bundled it by adding this to the end of the toml file:

[tool.setuptools.package-data]
llm_smollm2 = ["SmolLM2-135M-Instruct.Q4_1.gguf"]

Then dropping that .gguf file into the llm_smollm2/ directory and putting my plugin code in llm_smollm2/__init__.py.
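
One detail worth spelling out is how the plugin can locate the bundled model file at runtime once it has been installed from a wheel. A minimal sketch using importlib.resources; the helper name is illustrative and the real llm-smollm2 code may do this differently:

from importlib.resources import files

def bundled_model_path() -> str:
    # Resolve the .gguf file shipped inside the installed llm_smollm2 package
    # via the [tool.setuptools.package-data] entry shown above.
    return str(files("llm_smollm2") / "SmolLM2-135M-Instruct.Q4_1.gguf")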

I tested it locally by running this:

python -m pip install build
python -m build

I fired up a fresh virtual environment and ran pip install ../path/to/llm-smollm2/dist/llm_smollm2-0.1-py3-none-any.whl to confirm that the package worked as expected.

Publishing to PyPI

My cookiecutter template comes with a GitHub Actions workflow that publishes the package to PyPI when a new release is created using the GitHub web interface. Here's the relevant YAML:

deploy:
  runs-on: ubuntu-latest
  needs: [test]
  environment: release
  permissions:
    id-token: write
  steps:
    - uses: actions/checkout@v4
    - name: Set up Python
      uses: actions/setup-python@v5
      with:
        python-version: "3.13"
        cache: pip
        cache-dependency-path: pyproject.toml
    - name: Install dependencies
      run: |
        pip install setuptools wheel build
    - name: Build
      run: |
        python -m build
    - name: Publish
      uses: pypa/gh-action-pypi-publish@release/v1

This runs after the test job has passed. It uses the pypa/gh-action-pypi-publish Action to publish to PyPI - I wrote more about how that works in this TIL.

Is the model any good?

This one really isn't! It's not really surprising but it turns out 94MB really isn't enough space for a model that can do anything useful.

It's super fun to play with, and I continue to maintain that small, weak models are a great way to help build a mental model of how this technology actually works.

That's not to say SmolLM2 isn't a fantastic model family. I'm running the smallest, most restricted version here. SmolLM - blazingly fast and remarkably powerful describes the full model family - which comes in 135M, 360M, and 1.7B sizes. The larger versions are a whole lot more capable.

If anyone can figure out something genuinely useful to do with the 94MB version I'd love to hear about it.

Tags: pip, plugins, projects, pypi, python, ai, github-actions, generative-ai, edge-llms, llms, ai-assisted-programming, llm, gemini, uv, smollm, o3, llama-cpp


sqlite-s3vfs


sqlite-s3vfs

Neat open source project on the GitHub organisation for the UK government's Department for Business and Trade: a "Python virtual filesystem for SQLite to read from and write to S3."

I tried out their usage example by running it in a Python REPL with all of the dependencies:

uv run --python 3.13 --with apsw --with sqlite-s3vfs --with boto3 python

It worked as advertised. When I listed my S3 bucket I found it had created two files - one called demo.sqlite/0000000000 and another called demo.sqlite/0000000001, both 4096 bytes because each one represented a SQLite page.
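
For reference, the usage example looks roughly like this. This is a reconstruction from the project description rather than a verified copy of the README, so treat the S3VFS class name and its bucket keyword argument as assumptions:

import apsw
import boto3
import sqlite_s3vfs

# Point the virtual filesystem at an S3 bucket (assumed API: S3VFS(bucket=...))
bucket = boto3.Session().resource("s3").Bucket("my-test-bucket")
s3vfs = sqlite_s3vfs.S3VFS(bucket=bucket)

# Open a database whose pages are stored as individual S3 objects
connection = apsw.Connection("demo.sqlite", vfs=s3vfs.name)
cursor = connection.cursor()
cursor.execute("CREATE TABLE foo (x, y)")
cursor.execute("INSERT INTO foo VALUES (1, 2)")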

The implementation is just 200 lines of Python, implementing a new SQLite Virtual Filesystem on top of apsw.VFS.

The README includes this warning:

No locking is performed, so client code must ensure that writes do not overlap with other writes or reads. If multiple writes happen at the same time, the database will probably become corrupt and data be lost.

I wonder if the conditional writes feature added to S3 back in November could be used to protect against that happening. Tricky as there are multiple files involved, but maybe it (or a trick like this one) could be used to implement some kind of exclusive lock between multiple processes?
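
To illustrate the idea (this is not something sqlite-s3vfs does today): S3 conditional writes let a PUT fail if the target object already exists, which is the building block for a crude lock. The sketch below assumes boto3 surfaces the If-None-Match header as an IfNoneMatch parameter on put_object, so check the current boto3 documentation before relying on it:

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def try_acquire_lock(bucket: str, key: str) -> bool:
    """Attempt to create a lock object; returns False if another writer holds it."""
    try:
        # Assumption: IfNoneMatch="*" maps to the If-None-Match: * conditional write
        s3.put_object(Bucket=bucket, Key=key, Body=b"lock", IfNoneMatch="*")
        return True
    except ClientError as err:
        if err.response["Error"]["Code"] in ("PreconditionFailed", "ConditionalRequestConflict"):
            return False
        raise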

Via Hacker News comment

Tags: apsw, sqlite, python, uv, s3


APSW SQLite query explainer


APSW SQLite query explainer

Today I found out about APSW's (Another Python SQLite Wrapper, in constant development since 2004) apsw.ext.query_info() function, which takes a SQL query and returns a very detailed set of information about that query - all without executing it.

It actually solves a bunch of problems I've wanted to address in Datasette - like taking an arbitrary query and figuring out how many parameters (?) it takes and which tables and columns are represented in the result.
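
Here is roughly what calling it looks like; a minimal sketch assuming the query_info(connection, sql) calling convention described in the APSW documentation:

import apsw
import apsw.ext
from pprint import pprint

db = apsw.Connection(":memory:")
db.cursor().execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT)")

# Describe the query without executing it: bound-parameter count, result
# columns, referenced tables and more come back in the returned structure
info = apsw.ext.query_info(db, "SELECT title FROM docs WHERE id = ?")
pprint(info)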

I tried it out in my console (uv run --with apsw python) and it seemed to work really well. Then I remembered that the Pyodide project includes WebAssembly builds of a number of Python C extensions and was delighted to find apsw on that list.

... so I got Claude to build me a web interface for trying out the function, using Pyodide to run a user's query in Python in their browser via WebAssembly.

Claude didn't quite get it in one shot - I had to feed it the URL to a more recent Pyodide and it got stuck in a bug loop which I fixed by pasting the code into a fresh session.

Tags: pyodide, sqlite, claude, ai, llms, claude-artifacts, webassembly, ai-assisted-programming, python, generative-ai, apsw

Thursday, 06. February 2025

Ben Werdmüller

The Key Figures Working Alongside Elon Musk at DOGE and in the Trump Administration


[Avi Asher-Schapiro, Christopher Bing, Annie Waldman, Brett Murphy, Andy Kroll, Justin Elliott, Kirsten Berg, Sebastian Rotella, Alex Mierjeski, Pratheek Rebala and Al Shaw at ProPublica]

My colleagues at ProPublica have published the largest list yet of who is actually involved in DOGE:

"While some have been public about their involvement, others have attempted to keep their roles secret, scrubbing LinkedIn pages and other sources of data. With little information from the White House, ProPublica is attempting to document who is involved and what they are doing."

This is a living document: ProPublica is still reporting. As the article points out:

"We are still reporting. Do you have information about any of the people listed below? Do you know of any other Musk associates who have entered the federal government? You can reach our tip line. Please be as specific, detailed and clear as you can."

The whole list is worth reviewing.

#Democracy

[Link]


Simon Willison

Datasette 1.0a17


Datasette 1.0a17

New Datasette alpha, with a bunch of small changes and bug fixes accumulated over the past few months. Some (minor) highlights:

The register_magic_parameters(datasette) plugin hook can now register async functions (see the sketch below). (#2441)
Breadcrumbs on database and table pages now include a consistent self-link for resetting query string parameters. (#2454)
New internal methods datasette.set_actor_cookie() and datasette.delete_actor_cookie(), described here. (#1690)
/-/permissions page now shows a list of all permissions registered by plugins. (#1943)
If a table has a single unique text column Datasette now detects that as the foreign key label for that table. (#2458)
The /-/permissions page now includes options for filtering or excluding permission checks recorded against the current user. (#2460)
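
To illustrate the first item, a magic-parameter plugin whose resolver does asynchronous work might now look something like this sketch. The (key, request) resolver signature is my reading of the Datasette plugin documentation, so double-check it there:

import asyncio
from datasette import hookimpl

@hookimpl
def register_magic_parameters(datasette):
    # Resolver functions for magic parameters can now be async (new in this alpha)
    async def deployment(key, request):
        await asyncio.sleep(0)  # stand-in for a real async lookup
        return "production"
    return [("deployment", deployment)]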

I was incentivized to push this release by an issue I ran into in my new datasette-load plugin, which resulted in this fix:

Fixed a bug where replacing a database with a new one with the same name did not pick up the new database correctly. (#2465)

Tags: projects, annotated-release-notes, datasette


sqlite-page-explorer


sqlite-page-explorer

Outstanding tool by Luke Rissacher for understanding the SQLite file format. Download the application (built using redbean and Cosmopolitan, so the same binary runs on Windows, Mac and Linux) and point it at a SQLite database to get a local web application with an interface for exploring how the file is structured.

Here's it running against the datasette.io/content database that runs the official Datasette website:

Via Hacker News

Tags: redbean, sqlite, cosmopolitan, lua


The future belongs to idea guys who can just do things


The future belongs to idea guys who can just do things

Geoffrey Huntley with a provocative take on AI-assisted programming:

I seriously can't see a path forward where the majority of software engineers are doing artisanal hand-crafted commits by as soon as the end of 2026.

He calls for companies to invest in high quality internal training and create space for employees to figure out these new tools:

It's hackathon (during business hours) once a month, every month time.

Geoffrey's concluding note resonates with me. LLMs are a gift to the fiercely curious and ambitious:

If you’re a high agency person, there’s never been a better time to be alive...

Tags: ai-assisted-programming, llms, ai, generative-ai


The Pragmatic Engineer

The Pulse #123: Big Tech using its distribution advantage to win in AI?

Big Tech companies seem to have learned the lesson of how they became dominant in the past. So most of them are giving away AI functionality for free – even if this could be at a loss for them.

The Pulse is a series covering insights, patterns, and trends within Big Tech and startups. Notice an interesting event or trend? Send me a message.

Today, we cover:

Industry pulse. Where TikTok’s software engineers went, Apple Intelligence half-baked at launch, former FTC chair Lina Khan warns of reduced competition, OpenAI launches autonomous browser agent, Apple drops its AR glasses, with Meta also facing struggles – and more

Big Tech using distribution to win at AI? Meta, Microsoft, and Google are using their distribution power and financial muscle to give away AI features for free to users and customers. It’s a sensible way to avoid disruption, which makes it much harder for startups to compete

Before we start: thank you for all the thoughtful comments on last week’s issue about DeepSeek rocking the tech industry. I corrected a few details based on feedback – most notably how DeepSeek’s cost efficiency seems to be closer to 4-5x that of OpenAI, not 10-40x. See the updated issue here.

1. Industry Pulse Where did TikTok’s software engineers go?

The past six months have been something of a Doomsday scenario-esque countdown for TikTok, as the start date of its ban in the US crept ever closer. In the event, TikTok did indeed go offline for a few hours on 19 January, before President Trump gave the social network a stay of execution lasting 75 days.

How has this uncertainty affected software engineers at the Chinese-owned social network? According to data shared exclusively with The Pragmatic Engineer by Live Data Technologies, which tracks real-time live job change data across more than 160M professionals, this is how:

Where engineers went after TikTok, and where they joined from. Source: Live Data Technologies

There’s been an outflow of engineers to:

Big Tech: Meta, Google, Amazon (and AWS), Microsoft and Apple.

Publicly traded tech companies: Snap, Pinterest, Snowflake, DoorDash, NVIDIA, Coinbase, Coupang, Salesforce, Uber

A few private companies: X (formerly Twitter) and OpenAI

It seems the uncertainty has motivated TikTok engineers to interview and quit when they get an offer. Still, I find it surprising that hardly any scaleups are popular destinations among TikTok leavers. To me, it indicates that people quit for better liquid total compensation packages; which may be why publicly traded companies are the most popular destination.

Apple Intelligence half-baked?

Read more


Ben Werdmüller

Update on the 2024/2025 End of Term Web Archive


[Caralee Adams at the Internet Archive]

The Internet Archive is always a gem, but it's been particularly important this year.

"With two-thirds of the process complete, the 2024/2025 EOT crawl has collected more than 500 terabytes of material, including more than 100 million unique web pages. All this information, produced by the U.S. government—the largest publisher in the world—is preserved and available for public access at the Internet Archive.

[...] As an added layer of preservation, the 2024/2025 EOT Web Archive will be uploaded to the Filecoin network for long-term storage, where previous term archives are already stored. While separate from the EOT collaboration, this effort is part of the Internet Archive’s Democracy’s Library project. Filecoin Foundation (FF) and Filecoin Foundation for the Decentralized Web (FFDW) support Democracy’s Library to ensure public access to government research and publications worldwide."

This is important on multiple levels: most importantly, it means that even if the Internet Archive is attacked or shut down for any reason, these archived versions of government websites and data will remain online and accessible.

As it happens, the current administration has been pulling down datasets and redacting websites with wild abandon, so although this is a routine activity for the Archive whenever there's a change in administration, it provides a vital historical record this year. Good news for researchers, future historians, journalists, and anyone who depended on this data.

#Democracy

[Link]


Simon Willison

Quoting Andrej Karpathy


There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard.

I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away.

It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

Andrej Karpathy

Tags: andrej-karpathy, ai-assisted-programming, generative-ai, ai, llms


Ben Werdmüller

You Can’t Post Your Way Out of Fascism


[Janus Rose at 404 Media]

This is an important but hard pill to swallow:

"“The reality is you are oxygenating the things these people are saying even as you purport to debunk them,” Katherine Cross, a sociologist and author of Log Off: Why Posting and Politics (Almost) Never Mix, told 404 Media. “Whether it’s [New York Times columnist] Ross Douthat providing a sane-washing gloss on Trump’s mania or people on social media vehemently disagreeing and dunking on it, they’re legitimizing it as part of the discourse.”"

Posting is not activism. But it's both easy and cathartic to take the bait and run with it - and get approving clicks and likes in return. In sharing outrage rather than concrete real-world steps, we end up just amplifying the message.

As Janus Rose points out:

"Under this status quo, everything becomes a myopic contest of who can best exploit peoples’ anxieties to command their attention and energy. If we don’t learn how to extract ourselves from this loop, none of the information we gain will manifest as tangible action—and the people in charge prefer it that way."

Instead, co-ordinate online but manifest in the real world. Join protests, call your representatives, work for organizations that seek to uncover truth and take steps forward. Fewer hot takes; more collective action.

#Democracy

[Link]


Doc Searls Weblog

What Makes an AI personal?

The image above answered this prompt to ChatGPT: “Give me an image of a personal AI: one belonging to the individual and in control of the individual.” I added the text. It’s not perfect, or even close. But I’m rushed and not sure iterating this will get us anything better.

The unintended risks of handing over our lives to Personal AI is the headline of the opening segment of Jamie Smith‘s newsletter today. In it he shares a post by Liz Gerber about an autonomous choice made by the AI in her self-parking car:

Last night, my car automatically parked in the grocery store parking lot. The car beautifully maneuvered itself into a parking spot avoiding colliding with other cars and people walking through the lot.

However the car prioritized performance, fine tuning its movement in the parking spot as my fellow shoppers waited in below freezing temperatures for the car to do a perfect job.

The car failed to register that one shopper was a mom with a crying child in her arms and a toddler tugging at her leg. And it failed to take into account the several cars that were waiting to turn into the parking lot, causing a backup on the street.

While the experience was delightful for me, the car was inconsiderate of others. Even worse, I didn’t notice the others until after the car had parked itself.

I didn’t stop my car from prioritizing my needs over others. I not only handed over steering to my car, I handed over my humanity.

Jamie then asks, What happens when we hand over our experiences – often very human experiences – to an AI agent? and adds in summary, Digital wallets and Personal AI are going to have a transformative effect on our lives and society. But let’s make sure we don’t hand over our humanity with it.

Here’s another question: Is the AI in Liz’s car personal?

If we want truly personal AI—the kind under our control, working for us, enlarging our agency in the world—the answer to that question has to be no.

The AI in Liz’s car is not hers. It belongs to the car’s maker. It does not contain or express her humanity or extend her agency. It may be personalized, but it is not personal. (About the difference.)

We don’t have personal AI yet. (One bit of evidence is the image ChatGPT generated for me, above.) Yes, there is plentiful personalized AI, but nothing as personal as your pants, your PC, or the car you operate entirely on your own. Nothing that makes you more independent, not less.

We’re working at Kwaai on an open-source personal AI operating system (pAI-OS). There are other projects as well. But we don’t yet have a personal AI that does all the things this image suggests—

—much less an AI of our own that can express our ethical preferences.

Meanwhile, we need to be careful not to call the AIs in our things (especially big dangerous ones, such as cars) personal. They are not.

Personal AI is a greenfield. And there is a risk nothing will grow in it if we keep calling stuff like a car’s autonomous parking “personal AI.”

Oh, and you won’t find a better voice for personal empowerment in the digital world than Jamie Smith‘s. Please subscribe to his newsletter.

 


Ben Werdmüller

Move fast and break democracy

For the last few years, AI vendors have had an interesting marketing playbook: they’ve described the potential power of the technologies as being so great that it could lead to an artificial general intelligence that could either kill humanity or leave us behind and head for the stars. We ignore its power at our peril.

As it turned out, OpenAI and Microsoft’s definition of “artificial general intelligence” was that the technologies would reach one hundred billion dollars in revenue. It wasn’t tied to capabilities around reasoning, and did not, in actuality, relate to a Terminator future. It just meant that they’d be making a lot of money from it. All the talk of humanity-destroying intelligence and the existential questions that derived from it just served to draw attention to their services. The awe inspired by the tales they were weaving would, they hoped, lead to more signed contracts, more subscribers, more dollars on their balance sheets. People would treat the technologies as being insanely powerful even if they weren’t, and that would be enough.

A decade or more ago, a new ride-sharing service called Uber started to supplant taxi services in major cities like San Francisco. While taxi services were typically licensed, often at great cost to the individual drivers, Uber drivers operated without any such restrictions. It was illegal in many cities, but the company intentionally created workarounds to prevent police, city officials, and taxi firms from gathering evidence. A tool nicknamed Greyball allowed them to tag users who they decided were trying to conduct a sting on the service. Those users would see fake cars, and their drivers would cancel quickly. In the midst of this disinformation, it became hard to gather real evidence and make a case.

Eventually, despite its illegality, Uber became saturated in each market. Cities found themselves either acquiescing or making regulatory deals with the company. Uber had evaded the authorities while growing quickly, and it became widely used. It was clear that cities were going to have trouble shutting it down, so they ultimately adjusted to accept its existence. Law enforcement had been too slow; Uber had outrun and outmaneuvered it, and now it was here to stay.

The same playbooks that have allowed high-growth tech companies to become effective monopolies in America are now being used on American governance itself.

Donald Trump is not a king and does not have the right to wield absolute power. He and his party control all three branches of government (the executive, legislative, and judicial branches are all Republican-dominated), but avenues for objection, checks on his power, and levers to limit his reach remain. But that doesn’t necessarily matter: Donald Trump is acting like a king. He is restructuring the government as if he were one, making statement after statement to reinforce that image. Much of it is hot air: things that will never come to pass. But just as people will treat all-powerful artificial intelligence as real when AI vendors pretend it exists, I believe Trump’s CEO-king act is designed to make us act as if there are no checks or limits on his abilities. We are meant to gaze in awe, and his critics are meant to feel despondent, so that he can cement his imaginary powers for real and conduct his illegal business with impunity regardless of the regulations.

DOGE, which subsumed the USDS to become the awkwardly-named United States Department of Government Efficiency Service, is running ahead of regulations with the same gusto that Uber did during its early years. It should go without saying that inviting recent high school graduates and early twenty-somethings with no security clearance to wantonly access the personal data of every American, and to alter the source code that controls core government services, is illegal. It’s so outlandish that it sounds absolutely bizarre when you describe it out loud, like something from a speculative fiction fever dream, but it’s happening in plain sight. There are plenty of rules in place to prevent their activities from taking place. But who is going to catch up to them?

Eventually, DOGE will either be stopped or face regulatory restrictions on its activities and reach. But by then, it will be too late: the code will be altered, the personal information will be revealed, the funding spigot to core government services will have been shut off, leaving them to wither on the vine. Legal objections have popped up everywhere, but the cogs of justice are far slower than a bunch of entrepreneurial kids with the keys to the city. Lawmakers and civil rights organizations can shake their fists and say it’s illegal, but it’s done. DOGE isn’t just evading oversight: it’s moving fast and breaking things on a scale even Uber never dreamed of. It’s governance as a high-growth startup, where rule-breaking isn’t a side effect — it’s the entire strategy.

The important thing isn’t so much who is doing it as what is being done. Much has been made of the fact that Elon Musk is unelected, which is true: he is a private citizen with highly personal motives doing this work under dubious auspices. But the events of the last few weeks would be heinous even if they were conducted directly by elected officials acting in good faith. Stopping Musk from doing these things is a good idea, but the core problem is the acts, not the man.

The question, then, is what we do next.

In the New York Times, Jamelle Bouie points out that this wasn’t what brought most Trump voters to the polls:

For as much as some of Trump’s and Musk’s moves were anticipated in Project 2025, the fact of the matter is that marginal Trump voters — the voters who gave him his victory — did not vote for any of this. They voted specifically to lower the cost of living. They did not vote, in Musk’s words, for economic “hardship.” Nor did they vote to make Musk the co-president of the United States or to give Trump the power to destroy the capacity of the federal government to do anything that benefits the American people. They certainly did not vote for a world where the president’s billionaire ally has access to your Social Security number.

One task is to pierce the reality distortion field of Trump’s court in the eyes of his opponents. We don’t live in a full-scale dictatorship (at least, not yet). All of this can be stopped. His power is limited, and can be curtailed. And at the center of it all, he is a small-minded former reality TV star with a tiny worldview who eats his steak overcooked and throws his plate at the wall when he’s having a tantrum. The emperor has no clothes, and those that oppose him must see that clearly. The bigger task is revealing that fact to the more reasonable of the people who elected him: people for whom the cost of living is more important than enacting some kind of perverse revenge on inclusive society.

Then I believe the next task is to build an alternative, not in reaction to Trump, but in itself, based on upholding core values and improving everybody’s quality of life. One of the challenges of being aghast at what is going on is that American institutions really have underserved the American people, and have often caused real harm overseas. It’s easy — and correct — to be worried about what it means to suddenly encourage the entire CIA to resign, but it’s an awkward rhetorical position to be put in to defend the institution. The CIA has a long history of arguably criminal behavior: conducting undemocratic coups, assassinating world leaders, and violating human rights in our name.

The status quo doesn’t work. The American people have made that clear. So it’s on us to invent something new. What does it mean to create a truly inclusive, peaceful, democratic society? What does it mean to have a peaceful foreign policy? What does it mean to focus on improving quality of life rather than an economic metric that encourages monopolies and billionaires while letting ordinary people suffer?

The playbooks of OpenAI, Uber, and others have long been countered by other modes of operating. Hockey-stick growth is not the only way to build software and serve people who need help. Co-operation, mutual aid, and collective collaboration have effectively re-made software, and through it the world, and we’re now seeing the fruit of that through movements like the open social web. High-growth tech has the flashy marketing moves and the attendant hype cycle, but quietly, other movements have been steadily building. The same is true for America.

As Bouie says in his piece:

Whatever comes next, should the country weather this attempted hijacking, will need to be a fundamental rethinking of what this system is and what we want out of it.

Anything less will set us up for yet another Trump and yet another Musk.

I believe this is correct, and offer this idea for consideration:

The people with the ideas that can best save America are the people who are currently being pushed out of it. This is not a coincidence. Black women, trans activists, communities built on radical inclusion and emergent strategies, workers’ groups and communities bound in solidarity have created modes of communication and support that have transformed American society for the better. These are people for whom the shock and awe of a smoke and mirrors campaign does not work; who cannot be convinced to fit into a template designed to force people into being someone else’s profit engine; who have demonstrated the unstoppable nature of peer-to-peer mutual aid. It makes them dangerous. It also makes them more powerful than the dying gasp of the twentieth century we’re seeing sputter out before us.

We should listen to them: people who are often at the edges even though they deserve to sit at the center of society. They often see harms perpetuated before everybody else; they often see the solutions first, too. It’s not that it’s on them to save everybody else. It’s that they’ve been sounding the alarm and telling us what to do for decades, and nobody has been listening. It’s about time we did.

The same playbooks that have created monopolies, crushed labor rights, and gamed regulations are now being used to gut democratic governance itself. But these playbooks have always had an alternative: one rooted in cooperation, mutual aid, and community-driven solutions. That alternative exists; it’s just been drowned out by billionaires and venture-backed empire-builders. It’s time to listen to the people who have been building it all along.

Wednesday, 05. February 2025

IdM Laboratory

By the way, EUDIW Architecture Reference Framework 1.5.0 is out

Hello, this is Fujie.
By the way, version 1.5.0 of the EUDIW ARF came out on February 4. https://eu-digital-identity-wallet.github.io/eudi-doc-architecture-and-reference-framework/latest/architecture-and-reference-framework-main/


Looking at the CHANGELOG on GitHub: “The ARF is aligned with the adopted Implementing Acts, covering articles 5a and 5c of the eIDAS Regulation. The ARF also includes changes in response to comments provided on Github and by other stakeholders. Over more than 275 comments lead to changes in the ARF.” In short: go look at the contents for yourself.

Simon Willison

S1: The $6 R1 Competitor?

S1: The $6 R1 Competitor?

Tim Kellogg shares his notes on a new paper, s1: Simple test-time scaling, which describes an inference-scaling model fine-tuned on top of Qwen2.5-32B-Instruct for just $6 - the cost for 26 minutes on 16 NVIDIA H100 GPUs.

Tim highlights the most exciting result:

After sifting their dataset of 56K examples down to just the best 1K, they found that the core 1K is all that's needed to achieve o1-preview performance on a 32B model.

The paper describes a technique called "Budget forcing":

To enforce a minimum, we suppress the generation of the end-of-thinking token delimiter and optionally append the string “Wait” to the model’s current reasoning trace to encourage the model to reflect on its current generation

That's the same trick Theia Vogel described a few weeks ago.
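
Roughly, the mechanics look like this. Below is a minimal Python sketch of the idea (my own illustration, not the paper's code): generate_step, the "</think>" delimiter, and the step budgets are all assumptions standing in for the real model's token-level generation loop.

from typing import Callable

def budget_forced_generate(
    generate_step: Callable[[str], str],  # stand-in for the model's next-chunk generation
    prompt: str,
    end_of_thinking: str = "</think>",    # assumed delimiter; the real model uses its own special token
    min_steps: int = 8,                   # force at least this much "thinking"
    max_steps: int = 64,                  # hard budget cap
) -> str:
    trace = prompt
    steps = 0
    while steps < max_steps:
        chunk = generate_step(trace)
        steps += 1
        if end_of_thinking in chunk:
            if steps < min_steps:
                # Suppress the delimiter (and anything after it), then append
                # "Wait" to nudge the model into reflecting further.
                trace += chunk.split(end_of_thinking)[0] + " Wait"
                continue
            trace += chunk
            break
        trace += chunk
    return trace

# Toy usage with a stub "model" that tries to stop thinking after three steps.
counter = {"n": 0}
def fake_model(text: str) -> str:
    counter["n"] += 1
    return " step" if counter["n"] < 3 else " done </think> answer"

print(budget_forced_generate(fake_model, "Question:", min_steps=5, max_steps=10))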

Here's the s1-32B model on Hugging Face. I found a GGUF version of it at brittlewis12/s1-32B-GGUF, which I ran using Ollama like so:

ollama run hf.co/brittlewis12/s1-32B-GGUF:Q4_0

I also found those 1,000 samples on Hugging Face in the simplescaling/s1K data repository there.

I used DuckDB to convert the parquet file to CSV (and turn one VARCHAR[] column into JSON):

COPY (
    SELECT
        solution,
        question,
        cot_type,
        source_type,
        metadata,
        cot,
        json_array(thinking_trajectories) as thinking_trajectories,
        attempt
    FROM 's1k-00001.parquet'
) TO 'output.csv' (HEADER, DELIMITER ',');

Then I loaded that CSV into sqlite-utils so I could use the convert command to turn a Python data structure into JSON using json.dumps() and eval():

# Load into SQLite
sqlite-utils insert s1k.db s1k output.csv --csv
# Fix that column
sqlite-utils convert s1k.db s1k metadata 'json.dumps(eval(value))' --import json
# Dump that back out to CSV
sqlite-utils rows s1k.db s1k --csv > s1k.csv

Here's that CSV in a Gist, which means I can load it into Datasette Lite.

It really is a tiny amount of training data. It's mostly math and science, but there are also 15 cryptic crossword examples.

Tags: duckdb, datasette-lite, inference-scaling, ai, ollama, llms, datasette, generative-ai, qwen


The Pragmatic Engineer

AI Engineering with Chip Huyen

Before we start: a rare giveaway! To celebrate the podcast’s YouTube channel crossing 100K subscribers, I’m doing a giveaway of 100 books (10 of them signed and shipped) and 100 newsletter subscriptions. To enter, just subscribe to the podcast on the platform you use, leave a star review, and fill out this form, indicating you’d like to participate. The giveaway closes in 72 hours. More details here.

Stream the Latest Episode

Available now on YouTube, Apple and Spotify. See the episode transcript at the top of this page, and a summary at the bottom.

Brought to You By

Swarmia — The engineering intelligence platform for modern software organizations.

Graphite — The AI developer productivity platform.

Vanta — Automate compliance and simplify security with Vanta.

In This Episode

On today’s episode of The Pragmatic Engineer, I’m joined by Chip Huyen, a computer scientist, author of the freshly published O’Reilly book AI Engineering, and an expert in applied machine learning. Chip has worked as a researcher at Netflix, was a core developer at NVIDIA (building NeMo, NVIDIA’s GenAI framework), and co-founded Claypot AI. She also taught Machine Learning at Stanford University.

In this conversation, we dive into the evolving field of AI Engineering and explore key insights from Chip’s book, including:

• How AI Engineering differs from Machine Learning Engineering

• Why fine-tuning is usually not a tactic you’ll want (or need) to use

• The spectrum of solutions to customer support problems – some not even involving AI!

• The challenges of LLM evals (evaluations)

• Why project-based learning is valuable—but even better when paired with structured learning

• Exciting potential use cases for AI in education and entertainment

• And more!

Takeaways

My biggest takeaways from this episode:

1. “AI engineering” feels closer to software engineering than to ML engineering. The term itself is very new, and “AI engineering” evolved from ML engineering. A big difference is that thanks to LLMs being easy to use (both via APIs, and locally) “AI engineering” is much more about building a product first – and later on, getting around to tweaking the model itself. ML engineering was the other way around: spend a lot of time building a model, and then build an application on it.

2. To get good at AI engineering, focus on the basics. Understand what an LLM is (and how it works), how to evaluate them, how to use RAG, what finetuning is, and how to optimize inference. All of these techniques are foundational, and will remain important in a few years’ time as well. Chip’s book AI Engineering covers all these topics.

3. “The more we want to not miss out on things, the more things we will miss.” When I asked Chip what she thinks about the fear of missing out (FOMO) across AI, she said it’s ironic that when you are worried about missing out and try to keep up with everything new, you stay shallow! Chip doesn’t read news: she chooses to focus on deep research or learning instead. If something is important today, it will still be important next week, after she’s finished learning whatever she was focused on.

4. Solving the problem is more important than using the latest AI tools. Amusingly, a lot of teams miss this part and build overcomplicated AI solutions that do practically nothing for the business.

Before the interview, I started going through Chip’s newly published book, AI Engineering. It feels to me like this book will become a desk reference, and I recommend it.

The Pragmatic Engineer deepdives relevant for this episode

Applied AI Software Engineering: RAG

How do AI software engineering agents work?

AI Tooling for Software Engineers in 2024: Reality Check

IDEs with GenAI features that Software Engineers love

Timestamps

(00:00) Intro

(01:31) A quick overview of AI Engineering

(06:45) How Chip ensured her book stays current amidst the rapid advancements in AI

(11:35) A definition of AI Engineering and how it differs from Machine Learning Engineering

(18:15) Simple first steps in building AI applications

(24:38) An explanation of BM25 (retrieval system)

(25:28) The problems associated with fine-tuning

(29:40) Simple customer support solutions for rolling out AI thoughtfully

(35:29) Chip’s thoughts on staying focused on the problem

(37:04) The challenge in evaluating AI systems

(40:03) Use cases in evaluating AI

(43:09) The importance of prioritizing users’ needs and experience

(48:09) Common mistakes made with Gen AI

(53:57) A case for systematic problem solving

(54:57) Project-based learning vs. structured learning

(1:00:07) Why AI is not the end of engineering

(1:04:56) How AI is helping education and the future use cases we might see

(1:08:58) Rapid fire round

A summary of the conversation What is AI engineering?

AI engineering is a shift from machine learning to more engineering and product-focused work.

Previously, building machine learning applications required creating custom models, necessitating data and expertise in model training. Now, AI models can be used simply by making API calls. This lowers the barrier to entry and eliminates the need for specialized ML/AI degrees or for collecting large amounts of data.

Machine learning engineers did a lot of preparation work. They traditionally focused on data collection, annotation, and model training before deployment. AI engineering, in contrast, starts with a product idea and then progresses to data gathering and working with models.

AI Engineering is a new term. It emerged because the role is different from traditional Machine Learning Engineering. Chip surveyed people building applications on top of foundation models, and 'AI Engineering' was the most common way to describe what they did – so she stuck with this name!

A key difference compared to ML engineering: AI engineers often begin with readily available APIs or pre-trained models. This means they can start to prototype and iterate early on. They will only consider building custom models much, much later! Perplexity, for example, only started building its own models after the first version of its product had launched.

Ways to build AI applications

Start with basics. Understand what a “good response” and a “bad response” means. To do so, you need to set guidelines and evaluate responses. Use both automation and manual review.

Retrieval-augmented generation (RAG) is a commonly used pattern for adding context to model responses. RAG involves retrieving relevant documents or information to augment the model's context. We previously did a deepdive on RAG.

Keyword retrieval. Many equate RAG with vector search, but a simpler approach may be more effective initially. Try keyword retrieval before moving on to vector search, as vector databases can be costly and may obscure specific keywords.
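
To make that concrete, here is a minimal keyword-retrieval sketch in Python using the rank_bm25 package (my illustration, not from the episode); the corpus and query are made up.

from rank_bm25 import BM25Okapi  # pip install rank-bm25

# Toy corpus standing in for a real document store.
documents = [
    "How to reset your password in the customer portal",
    "Refund policy for annual subscriptions",
    "Troubleshooting failed payments and card errors",
]
tokenized_corpus = [doc.lower().split() for doc in documents]
bm25 = BM25Okapi(tokenized_corpus)

query = "my payment failed".lower().split()
# Retrieve the top matches to place into the model's context.
print(bm25.get_top_n(query, documents, n=2))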

Data preparation and chunking are important to get right for RAG. Techniques such as extracting metadata or keywords from documents can significantly improve performance, potentially more than focusing on the database itself.
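
As a rough illustration of that preparation step (my own sketch, with arbitrary chunk sizes and a crude keyword heuristic, not anything prescribed in the episode):

import re
from collections import Counter

def chunk_document(text, title, chunk_size=500, overlap=50):
    """Split a document into overlapping character chunks with simple metadata attached."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, max(len(text), 1), step):
        body = text[start:start + chunk_size]
        if not body:
            break
        # Crude keyword extraction: the most common longer words in the chunk.
        words = re.findall(r"[a-zA-Z]{5,}", body.lower())
        keywords = [word for word, _ in Counter(words).most_common(5)]
        chunks.append({"title": title, "text": body, "keywords": keywords})
    return chunks

print(chunk_document("Refunds are issued within 14 days of purchase. " * 30, "Refund policy")[0]["keywords"])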

Fine-tuning: often a last resort and many teams never do it. This is due to the complexities of hosting and maintaining a fine-tuned model. The rapid pace of model development also makes it difficult for fine-tuned models to maintain their advantage over time.

One step at a time. Take a structured approach: starting with prompting, adding data, and then move to more complex methods if needed. Understand that the problem being solved is more important than using the latest technology.

Resist the urge to ship only incremental improvements. Building out new use cases is usually far more beneficial.

Practical approaches and avoiding pitfalls

Start with a human in the loop. After that, gradually increase automation as confidence in the AI solution grows. A phased rollout, starting with internal use cases, can help to mitigate risk.

Focus on solving the problem, not just using AI. Apply the simplest solutions to remove roadblocks.

Common mistakes:

Using GenAI when it’s not needed. Some problems may be better solved with simpler solutions like spreadsheets or manually scheduling tasks.

Abandoning GenAI when it doesn’t seem to work, without pinpointing where the process fails. Reasons for GenAI not (yet) working could be in data extraction or text processing. Don’t give up too easily!

Jumping into complex solutions like vector databases or fine-tuning without addressing simpler approaches.

Be careful about using untested or rapidly changing AI frameworks because they can introduce bugs and issues.

Evaluating AI systems

A difficult area, because as AI becomes more sophisticated, it is harder to assess the output. A convincing summary might not be a good summary, and a math solution from AI can be hard to evaluate if one does not know the answer.

There is no single, simple solution to evaluation. Instead, a combination of methods is required and talking to users, analysing their interactions and manual data inspections are essential.

Measure what matters. Sometimes, what seems intuitive isn’t what users actually want, for example, a meeting summarisation tool focusing on action items instead of overall content.

Do not skip the hard and manual work! Manual data inspection is critical and still offers the highest ratio of value.

Human evaluations are important for seeing how users are actually using the product and to correlate with automated metrics.
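
One way to picture combining these methods is a small harness that runs cheap automated checks on every response and routes a random sample to human review. The checks and sampling rate below are placeholders of my own, not anything from the episode.

import random

def automated_checks(question, answer):
    # Cheap heuristic checks; real evals would be task-specific.
    return {
        "non_empty": bool(answer.strip()),
        "not_too_long": len(answer) < 2000,
        "mentions_refund": "refund" in answer.lower() if "refund" in question.lower() else True,
    }

def evaluate(samples, human_review_rate=0.1):
    results, review_queue = [], []
    for sample in samples:
        checks = automated_checks(sample["question"], sample["answer"])
        results.append({**sample, "passed": all(checks.values())})
        if random.random() < human_review_rate:
            review_queue.append(sample)  # routed to manual inspection
    return results, review_queue

results, to_review = evaluate([{"question": "What is your refund policy?", "answer": "Refunds within 14 days."}])
print(results[0]["passed"], len(to_review))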

Learning AI Engineering

Two ways of learning:

Project-based learning: picking a project and working through all the problems.

Structured learning: taking courses or reading books.

Tutorials are great but it's important to stop and ask questions. Without that, tutorials can lead to mindlessly copying code without understanding it.

Combine project-based learning with structured learning. Also, observe your own workflow and see what parts can be automated by AI to get use case ideas.

Coding is not the same as software engineering. Software engineering is about solving problems, and while AI can automate coding, it cannot fully automate the problem-solving process.

AI can enable software engineers to build much more complex software.

AI will not eliminate software engineering because it can only automate part of the software engineering process. The need for precision in software development requires someone who understands the nuances of code.

Resources & Mentions

Where to find Chip Huyen:

• X: https://x.com/chipro

• LinkedIn: https://www.linkedin.com/in/chiphuyen/

• Website: https://huyenchip.com/

Mentions during the episode:

• AI Engineering: Building Applications with Foundation Models: https://www.amazon.com/AI-Engineering-Building-Applications-Foundation/dp/1098166302/r

• DeepSeek: https://www.deepseek.com/

• Multimodality and Large Multimodal Models (LMMs): https://huyenchip.com/2023/10/10/multimodal.html

• BM25: https://en.wikipedia.org/wiki/Okapi_BM25

• Microsoft Bot Framework: https://dev.botframework.com/

• Python: https://www.python.org/

• Javascript: https://www.javascript.com/

• ChatGPT: https://chatgpt.com/

• Claude: https://claude.ai/

• Llama: https://www.llama.com/

• Complex Adaptive Systems: An Introduction to Computational Models of Social Life (Princeton Studies in Complexity): https://www.amazon.com/Complex-Adaptive-Systems-Introduction-Computational/dp/0691127026/

• The Selfish Gene: https://www.amazon.com/Selfish-Gene-Anniversary-Landmark-Science/dp/0198788606/

• Antifragile: Things That Gain from Disorder: https://www.amazon.com/dp/0812979680/

Production and marketing by Pen Name. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.


Simon Willison

Gemini 2.0 is now available to everyone

Gemini 2.0 is now available to everyone

Big new Gemini 2.0 releases today:

Gemini 2.0 Pro (Experimental) is Google's "best model yet for coding performance and complex prompts" - currently available as a free preview.

Gemini 2.0 Flash is now generally available.

Gemini 2.0 Flash-Lite looks particularly interesting:

We’ve gotten a lot of positive feedback on the price and speed of 1.5 Flash. We wanted to keep improving quality, while still maintaining cost and speed. So today, we’re introducing 2.0 Flash-Lite, a new model that has better quality than 1.5 Flash, at the same speed and cost. It outperforms 1.5 Flash on the majority of benchmarks.

That means Gemini 2.0 Flash-Lite is priced at 7.5c/million input tokens and 30c/million output tokens - half the price of OpenAI's GPT-4o mini (15c/60c).

Gemini 2.0 Flash isn't much more expensive: 10c/million for text/image input, 70c/million for audio input, 40c/million for output. Again, cheaper than GPT-4o mini.
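
To put those per-token prices in perspective, here is a small back-of-the-envelope calculation using only the prices quoted above, applied to a hypothetical request of 100,000 input tokens and 10,000 output tokens:

# Dollars per million tokens, taken from the figures quoted above.
prices = {
    "gemini-2.0-flash-lite": (0.075, 0.30),
    "gemini-2.0-flash (text/image input)": (0.10, 0.40),
    "gpt-4o-mini": (0.15, 0.60),
}

input_tokens, output_tokens = 100_000, 10_000  # hypothetical request
for model, (in_price, out_price) in prices.items():
    cost = input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price
    print(f"{model}: ${cost:.4f}")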

I pushed a new LLM plugin release, llm-gemini 0.10, adding support for the three new models:

llm install -U llm-gemini
llm keys set gemini
# paste API key here
llm -m gemini-2.0-flash "impress me"
llm -m gemini-2.0-flash-lite-preview-02-05 "impress me"
llm -m gemini-2.0-pro-exp-02-05 "impress me"

Here's the output for those three prompts.

I ran Generate an SVG of a pelican riding a bicycle through the three new models. Here are the results, cheapest to most expensive:

gemini-2.0-flash-lite-preview-02-05

gemini-2.0-flash

gemini-2.0-pro-exp-02-05

Full transcripts here.

I also ran the same prompt I tried with o3-mini the other day:

cd /tmp
git clone https://github.com/simonw/datasette
cd datasette
files-to-prompt datasette -e py -c | \
  llm -m gemini-2.0-pro-exp-02-05 \
  -s 'write extensive documentation for how the permissions system works, as markdown' \
  -o max_output_tokens 10000

Here's the result from that - you can compare that to o3-mini's result here.

Tags: gemini, llm, google, generative-ai, llm-pricing, ai, llms, pelican-riding-a-bicycle


Wrench in the Gears

Follow Up On Quantum Realism In Connection With Gordon Pask’s Mechanical Theatricality – Plus A “Diamond Age” Excerpt Read Aloud

This is a follow up to yesterday’s video reiterating my thoughts about Live Action Role Play with encoded artefacts as an example of extended mind and possible future inter-dimensional information processing system. The talk goes along with two earlier talks Jason and I did on the work of Michael Mateas here. The feature image is taken from Paul Pangaro’s online archive of Pask materials that is very much worth exploring here.

I am including a few maps if you want to poke around. All maps are best viewed on a laptop.

One features Pask’s cybernetic (and synesthetic) endeavors. Click here for the interactive version – enter “Pask” in the search bar on top to the left if needed to zoom in again.

https://embed.kumu.io/1e710f6678f0a88ace5a8d3fae2e9af7#untitled-map?s=bm9kZS10bXlnWlROWQ%3D%3D

One shows Jacobo Moreno and Timothy Leary’s work in psychology and group dynamics through role play. Click here for the interactive version – enter “Moreno” in the search bar on top to the left if needed to zoom in again.

https://embed.kumu.io/68e1cf46a34811b86499685100a83863#untitled-map?s=bm9kZS1zOW9nM0VjQg%3D%3D

One shows Mateas, LARPing, and non-traditional materials that compute. Click here for the interactive version – enter “Mateas” in the search bar on top to the left if needed to zoom in again.

https://embed.kumu.io/174960fb41354356ac833be7f75f6fad#untitled-map?s=bm9kZS1YY1VlZ1hLeA%3D%3D

One shows distributed cognition with respect to context and encoded artifacts. Click here for the interactive version – enter “Distributed Cognition” in the search bar on top to the left if needed to zoom in again.

https://embed.kumu.io/2a9737e61e74d617d28bfdf81d54a1cf#untitled-map?s=bm9kZS1OYUR5NHE4aQ%3D%3D

One shows DAO voting, which I believe relates to collapsing waves in Quantum Potential. Click here for the interactive version – enter “Extended Mind” in the search bar on top to the left if needed to zoom in again.

https://embed.kumu.io/0932741381c995440ed1891fc8d992b7#untitled-map?s=bm9kZS1MckliZmRyTA%3D%3D

Enjoy!

 

 


Webistemology - John Wunderlich

Growing Democracy

There is a place, little noticed and even less understood by most Canadians, where democracy is growing stronger despite a history of violence and exploitation.
The People of DRCongo Shed Light on the “Heart of Darkness” as Our Media Looks the Other Way

Steve Soloman

Keeping up with world events, important as it is, can nonetheless be fraught. Any Canadian who wishes to stay informed has seen the coverage: autocratic governments flexing their muscles, personality cults rather than responsible governments and the rule of law being threatened even in well-established democracies. The internal rot of misguided populism and hyper partisanship, challenging as they are, are only the first steps along a very rocky road.

Despite what we see and hear, the news is not all bad. There is a place, little noticed and even less understood by most Canadians, where democracy is growing stronger despite a history of violence and exploitation.

When we think of Africa - if we think of it at all - we picture a continent struggling to overcome its colonial past. Tragically for the people who live there, this image is not entirely wrong; Africa has seen nine military coups since 2020. But, as elsewhere, youth are the continent’s hope, and Africa is a very young continent, with an average age in 2023 of just under nineteen. Education levels are rising, though progress has been slow, due in part to a largely indifferent West. In the Democratic Republic of Congo (DRC), despite almost innumerable setbacks, positive strides are being made and there are valuable lessons to be learned if only we would take note. DRC and its people are working to overcome deplorable handicaps to create a government that is, to borrow a phrase, by the people and for the people.

When DRC does get mentioned, it’s because of either disease or warfare. This is largely due to the nature of the media beast; both what it is and what it isn’t. The media consists of profit-seeking businesses, and like any business it provides us with what it knows, or at least believes, we want. What it’s not is a pacifier ready to soothe our jangled nerves. Simply put, bad news sells. The news we get out of DRC is very much a case in point. There are positive developments, but we won’t read about this in mainstream media, not because of any sort of cover-up but rather out of rational business decisions. Thus democracy is evolving in the midst of a continent deserving of far more attention than it gets.

But before we can get to those positives we need to illustrate the circumstances out of which they have grown. This is really the only way we can appreciate what the people of DRC have suffered and continue to endure in their ongoing and heroic attempts to free themselves and achieve what so many of the rest of us take for granted.

DRC lies at the very centre of the African continent. With an area of 2.3 million square kilometres, it’s the second largest African nation after Algeria. It became a Belgian colony in 1908 but Europeans had ‘discovered’ the country well before that; in 1890 Joseph Conrad spent time there and later used it as the setting for Heart of Darkness. The book is a critique both of Belgian exploitation and an attitude toward Congo, and by extension all of Africa, that persists to this day.

Many of us continue to think of Africa as if it was one big country when in fact it contains fifty-four fully independent nations. Ethnically and culturally it’s at least as diverse as any other continent. In terms of its natural resources, DRC is amongst the wealthiest nations on Earth but its people are amongst the poorest. The Belgian colonizers exploited the country’s rubber reserves and more recently many nations, including Canada, have taken advantage of DRC’s vast quantities of copper, cobalt, zinc, silver, gold, bauxite, uranium and coltan. In all, minerals account for 90 per cent of DRC’s exports. To illustrate just one example, eighty per cent of the world’s supply of coltan comes from DRC and that mineral is in all the cell phones and laptops you and I use every day.

Much of the mining in DRC is referred to as ‘artisanal’, a euphemism which, when put into plain English, means men and children, typically boys, working as miners with no safety gear and little or no equipment of any kind beyond pickaxes. The coltan in our phones and computers has very likely come from one of these ‘artisanal’ mining operations. This is in no way meant to suggest we shouldn’t own these things, only that we should be aware; that we should be ready to take positive steps to give back in some way; to end the exploitation.

DRC officially gained its independence from Belgium on June 30th 1960. But the country was utterly unprepared to run its own affairs since the Belgians had taken no steps to prepare their former colonials for the challenges of self-rule. A republican government was established, with both a President and Prime Minister. Patrice Lumumba, a leader of the independence movement, and the country’s first Prime Minister, was both an African nationalist and a member of the pan-African movement. This meant he supported national self-determination and believed in strengthening bonds between all indigenous Africans and the global African diaspora. He played a major role in taking the nation from colony to independence. But his left-of-centre political views, in the midst of the Cold War, made him dangerous in the eyes of several western nations, particularly the United States.

The independence celebrations were short-lived; within a very few days the new nation was faced with what came to be known as the Congo Crisis. With Belgian support the provinces of Katanga and South Kasai seceded and there was also a revolt by members of the military as Black soldiers were no longer willing to accept orders from white officers. The United Nations was preparing to send troops to aid the government against the secessionists but the Secretary-General, Dag Hammarskjold, made a last minute decision not to do so. In response, Lumumba called for assistance from the Soviet Union, which sent ‘military advisors’. The Soviet presence caused a split in the government between Lumumba and the President, Joseph Kasa-Vubu. It was at this point that Joseph-Désiré Mobutu Sese-Seko, a lieutenant-colonel and Lumumba’s chief military advisor, stepped in to break the deadlock by leading a coup-d’etat. Supported by both Belgium and the United States, he expelled the Soviets and established a government that supported his view of Lumumba as a danger to the country. He led a second coup in 1965 that made himself President. About 100,000 people died during this period of political conflict, including Lumumba, who was assassinated on January 17, 1961. Mobutu ruled from 1965 to 1997. In 1971 he changed the country’s name to Zaire. His thirty-year rule also saw European investors expelled and the nationalizing of their companies. He looted the nation’s treasury. In 1990 he agreed to end the ban he had imposed on a multi-party state and formed a transitional government while retaining considerable political power. In 1994 he agreed to the appointment of Kengo Wa Dondo, an advocate for free-market reforms, as Prime Minister.

That same year in neighbouring Rwanda, between April and July, armed groups consisting of Hutu fighters committed genocide against the Tutsi people. Over 800,000 victims, mostly Tutsis, were killed and the violence spread into DRC as small groups of Hutu extremists formed militia groups. Such militias continue to operate in DRC to this day, funded largely by the Rwandan government under the leadership of Paul Kagame. Kagame, who is Tutsi, was a commander of a militia that sought, successfully, to put an end to the genocide. However he later initiated two rebel wars in DRC, the First Congo War, from 1996 to 1997, immediately followed by the Second Congo War of 1998 to 2003. Among horrific atrocities committed was the serial rape and murder of Congolese women. Canada’s Trudeau government, usually so ready to condemn such acts, has remained entirely silent about the continuing actions of the militia groups in DRC. As recently as July 13th of 2024, seventy-two people were killed by armed militia in a village about one hundred kilometres east of the nation’s capital, Kinshasa.

While the equally horrific acts occurring in Ukraine are breathlessly reported within hours, this slaughter in DRC was reported in a Toronto newspaper four days after the event and was buried at the bottom of page five.

Mineral wealth provides the incentive for the ongoing violence. Recent reports indicate that Rwanda’s national army, somewhat ironically named the Rwandan Defence Force, has deployed as many as four thousand soldiers to DRC. Their mission appears to be aiding the militia groups and facilitating the smuggling of minerals into Rwanda. In late January of this year, Rwanda sent even more troops into DRC, in numbers that indicated full-out invasion. Along with a militia group called M23, they captured the capital of DRC’s eastern region, Goma.

Since the 1990s Rwandan-backed militias have taken control of coltan and other mineral-mining operations. The Congolese government estimates that it loses $1 billion (U.S.) each year because of these illegal activities. The Congolese army is fighting to halt these operations but finds itself facing an enemy that is very well armed with modern weapons such as so-called mortar drones. Desertion is also a significant factor, as the militia groups pay their soldiers better wages than DRC can pay. People continue to be forced to flee their homes, a harsh reality in eastern DRC for almost thirty years. This violence has also resulted in severe food shortages as farms are abandoned in areas controlled by militia groups. The only realistic way to put an end to all this would be to confront the Rwandan government directly, with military force if needed. Why this has gone on all this time with no response from the West is a matter for conjecture, though it’s by no means far-fetched to suggest having access to relatively cheap minerals plays a role. A case can also be made that Western governments feel no pressure to act because their citizens are uninformed by media.

DRC has also been plagued by outbreaks of disease. In 1976 the country made headlines due to the outbreak of Ebola, a disease named after a tributary of the Congo River. The Ebola virus was first detected in a village about 110 kilometres from the river but it was decided to name the disease Ebola in order to avoid having the village become forever known as the starting point of this often - though not always - fatal pathogen. There have been subsequent outbreaks that have killed thousands of people. Work to create vaccines began to make progress in 2014. The World Health Organization did much to speed the process and an effective vaccine began to be used in 2016. Recent outbreaks have been quickly brought under control.

There are organizations in the West who lay claim to knowledge of the situation in DRC. Freedom House, a Washington, D.C.-based non-profit that assesses how democratic various nations are, continues to list DRC as “not free”. However, current trends suggest that this ranking should be revisited.

In January 2019, when Felix Tshisekedi became President, it marked the first time since independence that the presidency was peacefully transferred from one political party to another. Tshisekedi freed seven hundred political prisoners jailed by his predecessor. Understanding that education is vital to building a free and democratic state, he also invested heavily in schools for DRC’s next generation.

In December 2023 he faced another election, as the country’s constitution requires. Western media tended to focus on claims of voting irregularities that originated mostly with defeated candidates. Voter turnout was 48.8 per cent despite three provinces being unable to safely participate due to ongoing militia violence. That figure marks a small but important increase from the prior election in 2018 when 45.5 per cent of eligible voters cast a ballot (voter turnout in the Ontario election of 2022 was forty-three per cent). Tshisekedi was re-elected by a comfortable margin and reports from well-informed sources within DRC state that the majority of Congolais believe the election was run fairly, the results are credible and that the re-elected President had achieved some positive change that ordinary citizens could see for themselves. The really crucial point is this: if the majority of people in DRC see themselves as living in a democracy, how legitimate can it be for outsiders like Freedom House to claim otherwise?

DRC is just one of many countries that has had a long, hard struggle to achieve democracy, and the hard work of maintaining what has been won never stops. The people of that nation have overcome crushing obstacles, and the progress they’ve made continues to be under siege by those both in Africa and abroad who have very different interests.

Democracy doesn’t just happen by itself. It must be fought for and constantly supported by an informed and active electorate. The people of DRC have learned this vital lesson while too many of us in the West appear to have forgotten it.


Mike Jones: self-issued

Proposed Candidate Recommendation for Controlled Identifiers

The W3C Verifiable Credentials Working Group has published a Snapshot Candidate Recommendation of the Controlled Identifiers specification. This follows the five Candidate Recommendation Snapshots published by the working group in December 2024. Two of these specifications, including Securing Verifiable Credentials using JOSE and COSE, depend upon the Controlled Identifiers spec. The planned update to the W3C DID specification also plans to take a dependency upon it.

A W3C Candidate Recommendation Snapshot is intended to become a W3C Candidate Recommendation after required review and approval steps.

Thanks to my co-editor Manu Sporny and working group chair Brent Zundel for their work enabling us to reach this point.


Simon Willison

o3-mini is really good at writing internal documentation

o3-mini is really good at writing internal documentation

I wanted to refresh my knowledge of how the Datasette permissions system works today. I already have extensive hand-written documentation for that, but I thought it would be interesting to see if I could derive any insights from running an LLM against the codebase.

o3-mini has an input limit of 200,000 tokens. I used LLM and my files-to-prompt tool to generate the documentation like this:

cd /tmp
git clone https://github.com/simonw/datasette
cd datasette
files-to-prompt datasette -e py -c | \
  llm -m o3-mini -s \
  'write extensive documentation for how the permissions system works, as markdown'

The files-to-prompt command is fed the datasette subdirectory, which contains just the source code for the application - omitting tests (in tests/) and documentation (in docs/).

The -e py option causes it to only include files with a .py extension - skipping all of the HTML and JavaScript files in that hierarchy.

The -c option causes it to output Claude's XML-ish format - a format that works great with other LLMs too.

You can see the output of that command in this Gist.

Then I pipe that result into LLM, requesting the o3-mini OpenAI model and passing the following system prompt:

write extensive documentation for how the permissions system works, as markdown

Specifically requesting Markdown is important.

The prompt used 99,348 input tokens and produced 3,118 output tokens (320 of those were invisible reasoning tokens). That's a cost of 12.3 cents.
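
That 12.3 cents is easy to reproduce, assuming o3-mini's list pricing at the time was roughly $1.10 per million input tokens and $4.40 per million output tokens (treat those rates as my assumption, not something stated in the post):

input_tokens, output_tokens = 99_348, 3_118
in_price, out_price = 1.10, 4.40  # assumed $/million tokens for o3-mini

cost = input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price
print(f"${cost:.4f}")  # ~$0.123, i.e. about 12.3 cents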

Honestly, the results are fantastic. I had to double-check that I hadn't accidentally fed in the documentation by mistake.

(It's possible that the model is picking up additional information about Datasette in its training set, but I've seen similar high quality results from other, newer libraries so I don't think that's a significant factor.)

In this case I already had extensive written documentation of my own, but this was still a useful refresher to help confirm that the code matched my mental model of how everything works.

Documentation of project internals as a category is notorious for going out of date. Having tricks like this to derive usable how-it-works documentation from existing codebases in just a few seconds and at a cost of a few cents is wildly valuable.

Tags: llm, openai, o3, ai, llms, datasette, generative-ai, documentation, ai-assisted-programming, inference-scaling


Ambsheets: Spreadsheets for exploring scenarios

Ambsheets: Spreadsheets for exploring scenarios

Delightful UI experiment by Alex Warth and Geoffrey Litt at Ink & Switch, exploring the idea of a spreadsheet with cells that can handle multiple values at once, which they call "amb" (for "ambiguous") values. A single sheet can then be used to model multiple scenarios.

Here the cell for "Car" contains {500, 1200} and the cell for "Apartment" contains {2800, 3700, 5500}, resulting in a "Total" cell with six different values. Hovering over a calculated value highlights its source values and a side panel shows a table of calculated results against those different combinations.
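
One rough way to think about what an "amb" value does (my own sketch, not Ink & Switch's implementation): every formula that touches amb cells is evaluated over the cross product of their possible values.

from itertools import product

# Each "amb" cell is just a set of possible values.
car = [500, 1200]
apartment = [2800, 3700, 5500]

# A formula over amb cells evaluates against every combination of its inputs.
for c, a in product(car, apartment):
    print(f"car={c} apartment={a} total={c + a}")
# Prints the six scenario totals described above.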

Always interesting to see neat ideas like this presented on top of UIs that haven't had a significant upgrade in a very long time.

Via Hacker News

Tags: spreadsheets, geoffrey-litt, ui, ux


AI-generated slop is already in your public library

AI-generated slop is already in your public library

US libraries that use the Hoopla system to offer ebooks to their patrons sign agreements where they pay a license fee for anything selected by one of their members that's in the Hoopla catalog.

The Hoopla catalog is increasingly filling up with junk AI slop ebooks like "Fatty Liver Diet Cookbook: 2000 Days of Simple and Flavorful Recipes for a Revitalized Liver", which then cost libraries money if someone checks them out.

Apparently librarians already have a term for this kind of low-quality, low effort content that predates it being written by LLMs: vendor slurry.

Libraries stand against censorship, making this a difficult issue to address through removing those listings.

Sarah Lamdan, deputy director of the American Library Association says:

If library visitors choose to read AI eBooks, they should do so with the knowledge that the books are AI-generated.

Via Emily Bell

Tags: slop, ai, ethics, libraries

Tuesday, 04. February 2025

The Pragmatic Engineer

Cross-platform mobile development with Expo

Expo has quietly become one of the most popular tools for cross-platform development. Why is this, and how does Expo work? A deepdive with the Expo engineering team

React Native and Flutter are by far the most popular cross-platform development technologies, as previously covered in the deepdive, Cross-platform mobile development. React Native (made by Meta) seems more popular in the US and UK, and at mid to large-sized companies, while Flutter (made by Google) powers more apps overall, globally.

But there’s also one other company that consistently comes up in talk about cross-platform mobile technologies: Expo. When the Bluesky team shared their cross-platform tech stack, they mentioned that it was React Native and Expo. Meanwhile, the React Native “getting started” guide strongly advises starting development with Expo, and calls it “a production-grade React Native Framework.”

So, what is Expo and why is it so important for cross-platform React Native development?

A year ago, I travelled to California and visited Expo’s offices in Palo Alto, meeting with the team to find out more; a few months later, I got back in touch for extra details on how Expo works, and to learn what makes it popular. This deepdive covers:

Why is Expo popular? It played a large role making React Native as widespread as it is today, and its developer focus has been a big factor.

History. Expo began in dissatisfaction about the clunkiness of developing and shipping mobile apps. Today, a mission to make mobile development as seamless as web dev still motivates the team.

React Native vs a React Native framework. Expo is a React Native framework, and the only production-ready one, to date.

Expo: developer tools. Overview of tools Expo offers as open source, for free.

Shipping to production with EAS. Expo generates revenue with its Expo Application Services (EAS): a freemium remote build and submission service. An overview of its wide-ranging capabilities.

Alternatives for shipping cross-platform apps. Ionic Appflow, Shoebird, Fastlane, Bitrise, Codemagic.

This deepdive features contributions from Expo’s co-founder and CEO Charlie Cheever, co-founder and CTO James Ide, and engineering manager Brent Vatne. Their input is appreciated!

As with every deepdive we publish, The Pragmatic Engineer has no commercial affiliation with the company featured, and was not paid to write about them in this article. More in our ethics statement.

1. Why is Expo popular?

Expo is a startup based in Palo Alto, California, employing 32 people, 29 of whom are developers – a surprisingly high ratio of 90%. The company lists employees on its about page. The ratio at larger software companies like Meta, Google, or Microsoft is rarely above 30-40%, due to the presence of other teams like sales, support, and many other support functions.

Today, the open source Expo framework is used in more than 1 million public Github repositories, and has more than 800,000 downloads per week. The project is very active: more than 1,300 developers contribute in total, and there were more than 50,000 members of the Discord channel at time of publication.

Expo may be the biggest reason why React Native became as popular as Flutter. These days, when I hear a company develops with React Native, it almost always follows that they use Expo, too. This is because it has developer tools which make even the best cross-platform framework much less clunky to develop on for the iOS and Android native platforms.

For Flutter, similarly solid developer tooling which Google built and maintained has long been a differentiator; the search giant built sophisticated tools to debug, profile and diagnose Flutter applications. However, getting started with Flutter still requires a multi-step developer environment setup, while shipping iOS and Android apps is a pretty involved process, compared with the simplest ways of getting started with React Native and Expo:

Just open the website snack.expo.dev.

This is the site called “Expo Snack,” where you can immediately start to do live editing on a React Native application that runs on an Android emulator, an iOS emulator, or a web view:

An app built in two minutes without installing anything, running on a remote iOS simulator. Image: Expo Snack

You can also run the app on your mobile device by scanning a QR code and installing the Expo Go app. I also did this – and the app live updated as I made changes. With such little effort required, this felt magical.

A React Native app running on my phone as a native app, after a few minutes of coding and zero dev environment setup

Being able to live edit an app with zero development tooling feels magical, even in 2025. Developing with no tools installed locally is not possible when doing native iOS or Android development. Google’s cloud development environment, Project IDX, comes close, but it’s not as snappy. Meanwhile, Expo Snack is just one of many developer-first features produced by Expo’s small team, that’s on a par with Apple and Google tools. More below on Expo’s development tooling in the sections, “What is Expo?” and “Shipping to production.”

Why do companies move to Expo? We asked the Bluesky engineering team, who are heavy React Native users – and whom we previously covered in the deepdive Inside Bluesky’s engineering culture. Here’s what Paul Frazee – the dev who built Bluesky’s initial iOS, Android, and web applications using a single codebase – told us:

“We were initially manually building the app using Xcode and Android Studio, then uploading the builds.

The reason we first looked at Expo was to help us with deployments. Expo has a really solid cloud-build infrastructure (EAS Build) and over-the-air update system (EAS Update). We looked at the other React Native cloud build services and none of them seemed as well built, so we took a chance with Expo. We found ourselves iteratively adopting expo modules, and then moved fully into their framework. My only regret is that we didn’t start with it. It’s not often that I have such a positive experience.

We started migrating over to Expo around 9 months into app development, around February 2023. This was when we had a pretty mature app. It was an iterative migration that went pretty smoothly.”

2. History

As previously mentioned, Expo was founded out of frustration at how long it takes to build a mobile app. In 2012, Charlie Cheever was CTO at Quora and working on an iOS and Android app for the Q&A site. Being a hands-on developer with extensive web experience (he was an early engineer at Facebook), he assumed it would take at most a few weeks to build each native app. However, it actually took months per platform, and included terrible developer experiences on both iOS and Android; at least, compared to the web. From this came the idea for Expo. Charlie sums up the objective:

“The web is a pretty good way of doing development and it’s so powerful. Let’s take everything that’s good about the web and web development and make the mobile development space better.”

He teamed up with James Ide and they started experimenting with ways of using web technologies to build mobile apps, eventually ending up with a framework that wrapped mobile-native components; basically something pretty similar to React Native. They were about to launch when React Native was released, which was created and used by Meta, with around 40 people working on it. In contrast, the as-yet-unreleased Expo framework was developed by two people.

Rather than compete with React Native, Expo decided to embrace it. Charlie and James understood that coding an app was just one of the many parts of shipping a cross-platform app. Every developer using React Native still had other problem areas to solve:

Distribution to the iOS and Android App Stores

Release strategies

CI pipelines

Improving the developer experience

“Exponent” was the first product the team shipped; a React Native development tool that sat a level above React Native. The team announced it in 2015 at a React conference called React Rally. It took care of things like:

Dealing with Xcode and Android Studio

Provisioning profiles for iOS

Development and deployment certificates

Any Objective-C, Swift and Java-specific things

Sketch of the explanation used to introduce Exponent in 2015

The Exponent toolset kept evolving, and two years later in 2017, the name changed to Expo because it’s easier to say and remember. From 2017, Expo kept shipping new major releases of the Expo SDK several times per year. Back then, Expo’s SDK was on version 21; today, it’s on version 52. The company maintains a changelog listing notable changes, as well as a more detailed changelog.

3. React Native vs a React Native Framework

Expo is a framework and set of tools that uses React Native (RN) and simplifies the development of RN applications. It also happens to be the only React Native Framework currently considered production-grade. But what is a “React Native Framework?” As React Native builds on top of React, let’s see how things work in the React world.

React vs React Frameworks

React: providing core capabilities. React is a JavaScript library for building web user interfaces.

React code using the markup called JSX, a JavaScript syntax extension popularized by React which supports HTML-like rendering logic. Source: React

React takes care of many things, like:

High performance: using a Virtual DOM, React creates an in-memory data structure cache of the UI, computes the resulting differences, and updates the browser’s DOM only as needed. This process is called reconciliation and produces better performance

JSX: support for the JSX syntax extension to allow using HTML-like code within JavaScript

Reusable UI components: the core of React is reusable components. React specifies how to define components, their lifecycle events, nesting, communicating between them, rendering differently based on different conditions, and more

Handling events like clicking, hovering, keyboard inputs, swiping, etc.

UI state management: an opinionated way to manage state. Instead of manipulating individual parts of the UI, React uses a declarative approach

Other features like support for TypeScript, React Hooks (using React features from components), and support for server components, among others.

The simplest React component. The React framework specifies how to create these, and has support for even complex use cases for components. Source: React
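The screenshot doesn’t reproduce here, but a component in that spirit looks roughly like this; it is a sketch rather than the exact snippet from the React docs.

import React from 'react';

// A React component is a function that returns JSX.
function Greeting() {
  return <h1>Hello, world</h1>;
}

// It can then be nested inside other components like any HTML tag.
export default function App() {
  return <Greeting />;
}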

React Frameworks: providing production-grade capabilities. You can build web applications only using React. However, the more complex and larger an application gets, the more additional features must be built, like routing between pages, code-splitting between several files, fetching data, moving some rendering to the server, and so on. Here’s what the React team says about frameworks in the getting started guide:

“You can use React without a framework, however we’ve found that most apps and sites eventually build solutions to common problems such as code-splitting, routing, data fetching, and generating HTML. These problems are common to all UI libraries, not just React.

By starting with a framework, you can get started with React quickly, and avoid essentially building your own framework later.”

The React team lists three “production-grade” frameworks, defined by:

Support for all features needed to deploy and scale apps in production

Open source

Can be deployed to a personal server or a hosting provider

React Frameworks which meet these conditions, and are recommended by the React team:

Next.js: the Pages Router of Next.js is a full-stack React framework. It’s versatile and lets you create React apps of any size, from a mostly static blog, to a complex dynamic application. Maintained by Vercel.

Remix: a full-stack React framework with nested routing. It lets you break an app into nested parts that can load data in parallel and refresh in response to user actions. Maintained by Shopify.

Gatsby: a React framework for fast CMS-backed websites. Its rich plugin ecosystem and GraphQL data layer simplify integrating content, APIs, and services into one website. Maintained by Netlify.

React is funded by Meta, but React frameworks are funded by other companies. Vercel and Netlify are hosting providers and fund the frameworks as it helps drive adoption of their services. Remix was acquired by Shopify, and is being merged into React Router.

React Native vs a React Native Framework

Considering the differences between React and React Frameworks, it should come as no surprise that React Native has a similar split between “core” React Native features and production-ready features which frameworks should provide.

React Native offers core APIs and capabilities to build native apps. It’s a framework that allows using the React programming paradigm to build native iOS and Android applications, and it offers the following (a short code sketch follows this list):

A runtime for JavaScript and the React syntax, with the ability to augment it using native iOS and Android extensions

Similar-enough performance to native-only applications

Continual support for modern React features like Suspense (declaratively specifying the loading part of a component tree), Transitions (distinguishing between urgent and non-urgent updates), automatic batching (grouping multiple re-renders into a single re-render for better performance) and others

APIs to invoke native iOS or Android APIs

Hot reloading: during development, see changes made in the code update on the emulator or device in around a second
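As referenced above, here is a small sketch of how React Native code reaches platform functionality through core modules that wrap the native iOS and Android implementations; the NativeApiDemo component and its button label are made up for illustration.

import React from 'react';
import { Alert, Button, Platform, View } from 'react-native';

// Platform and Alert are thin wrappers over native APIs on each operating system.
export default function NativeApiDemo() {
  return (
    <View style={{ padding: 32 }}>
      <Button
        title={`Say hello on ${Platform.OS}`}
        onPress={() => Alert.alert('Hello', `Running on ${Platform.OS}`)}
      />
    </View>
  );
}

The same component renders platform-native controls on both iOS and Android, which is the core promise of the list above.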

React Native frameworks provide production-grade capability: The React Native team follows an RFC approach for discussions and proposals. In 2024, the team opened an RFC for React Native frameworks and arrived at the following definition of an RN framework:

“Shipping native apps to production usually requires a set of tools & libraries that are not provided by default as part of React Native, but are still crucial to hit the ship to production. Examples of such tools are: support for publishing applications to a dedicated store, or support for routing and navigation mechanisms.

When those tools & libraries are collected to form a cohesive framework built on top of React Native, we call this a React Native Framework.

A practical example of bespoke React Native Framework is how Meta uses React Native internally. We do have a collection of libraries, scripts and tools that make the integration of React Native easier with the various apps where it's used.”

Expo: the only production-grade open source RN framework available today. At the time of publishing, the React Native team only recommends Expo as a production-grade framework. This is a very strong recommendation, given that the same team also writes this about React Native frameworks:

“You’re either using a React Native Framework or you’re building your own React Native Framework”

So how did Meta end up recommending Expo? Charlie explains:

“I’d clarify the recommendation a tiny bit: the Meta team’s guidance about React Native is that you’re going to need to use some framework with it to build anything real. If you don’t choose an off the shelf one, you’re going to end up having to build one yourself.

Today, Expo is the only very mature React Native framework. However, if tomorrow another team comes along and builds another really good React Native framework, everyone would welcome this.”

Collaboration with Meta started early, as Brent Vatne, engineering manager at Expo, recalls:

“A few of us have been contributors to React Native since the earliest days, the initial preview release at React Conf 2015. We’ve demonstrated through our involvement for nearly 10 years that we are deeply invested in the success of React Native, and can be relied on to collaborate on everything from conference talks to some of the largest initiatives on the open source side of the project.

For example, over the past year we have been working closely with the React Native team at Meta on getting the New Architecture for React Native ready to roll out to open source. We helped prepare the ecosystem by migrating all of our tooling at Expo, using dependency statistics to determine the most impactful third party libraries and helping to migrate them, reporting and/or fixing related issues that we encounter in React Native, educating the community, and many other projects that come out of rolling out such a significant change.”

Expo seems to fill a gap in a way that helps Meta, by making React Native more accessible for external developers. Charlie:

“Meta’s focus is mostly about making React Native work great inside Meta, and ours is to make it work great outside of Meta. Meta doesn’t directly benefit from making React Native open source and most of their team’s focus is on making it work really well for them. So there’s a natural way that we’re able to fit together in a way that everyone wins!”

React Native by itself does not provide the tools necessary to deploy and scale a production-grade app. Meta has its own production-grade React Native framework, which is tailored to its needs and its vast number of custom internal tools. So it’s a win for Meta to have a company building a more generic production framework for React Native apps.

4. Expo: collection of developer tools

Expo makes it easier to build production-grade React Native apps by providing additional abstractions (like routing) and tools (like a command line) to make building apps faster. It brings these added features to React Native development, built on top of React Native:

Read more


Patrick Breyer

Half-decent new Polish chat control proposal to be discussed on Wednesday

Poland, which currently holds the presidency of the Council of the EU, is proposing an important change to the much-criticized EU chat control proposal: instead of mandating the blanket scanning of private chats, chat control would, as …

Poland, which currently holds the presidency of the Council of the EU, is proposing an important change to the much-criticized EU chat control proposal: instead of mandating the blanket scanning of private chats, chat control would remain at the providers’ discretion, as it is today. Representatives of the EU governments will discuss the proposal in the EU Law Enforcement Working Party on Wednesday.

“The new proposal is a breakthrough and a great leap forward in the fight for our digital privacy of correspondence,” comments Patrick Breyer (Pirate Party), former Member of the European Parliament and digital freedom fighter. “The proposal would protect secure encryption and, with it, the security of our smartphones. However, three fundamental problems would remain unsolved:

1. Mass surveillance: Even if providers such as Meta, Microsoft or Google practice chat control “voluntarily,” it remains completely untargeted and indiscriminate mass surveillance of all private messages sent via these services. According to the EU Commission, around 75% of the millions of private chats, photos and videos leaked every year by the industry’s unreliable chat control algorithms are not criminally relevant, and they put our intimate communications into the hands of strangers, where they are not safe and where they have no business being. A former judge of the European Court of Justice, Ninon Colneric (p. 34 f.), and the European Data Protection Supervisor (para. 11) have warned that this indiscriminate surveillance violates our fundamental rights even when it is left to the providers’ discretion. Two lawsuits against this practice are already pending in Schleswig-Holstein and Bavaria.

The European Parliament proposes a different approach: telecommunications surveillance could be ordered on a mandatory basis, but would be limited to persons or groups connected to child sexual abuse. Parliament also proposes that providers be required to make their services safer for young people through a set of default settings, prompts and warnings.

2. Digital house arrest: Under Article 6, which Poland proposes to keep unchanged, users under 16 could in future be barred, for their own protection against grooming, from installing everyday apps from the app store. This includes messenger apps such as WhatsApp, Snapchat, Telegram or Twitter, social media apps such as Instagram, TikTok or Facebook, games such as FIFA, Minecraft, GTA, Call of Duty, Roblox, dating apps, and video conferencing apps such as Zoom, Skype and FaceTime. Such a minimum age would be easy to circumvent and would patronize and isolate young people instead of strengthening them.

3. Ban on anonymous communication: Under Article 4(3), which Poland proposes to keep unchanged, users would no longer be able to set up anonymous email or messenger accounts or chat anonymously without having to show an ID card or their face, making them identifiable and creating the risk of data leaks. This would restrict, for example, sensitive chats about sexuality, anonymous press communication with sources (e.g. whistleblowers), and political activity.

All in all, compared with the previous texts, the Polish proposal is a big step forward for our safety on the internet, but it still needs serious improvement. At the same time, the proposal probably already goes too far for the hard-line majority of EU governments and for the EU Commission, whose appetite for surveillance has so far been so extreme that they would rather abandon the victims entirely, without any agreement, than accept a proportionate alternative to chat control that would stand up in court and command a political majority.”

Monday, 03. February 2025

Doc Searls Weblog

A Hat Tip to United

United Airlines details 6 big inflight entertainment updates, including all-new Control Tower map, by Zach Griff in The Points Guy, is thick with welcome news for frequent United fliers, of which my wife and I are two. (So far I have clocked 1,533,214 miles with United, and she has about double that. We are also […]
The location of just one United Club at O’Hare (ORD), one of United’s main hubs, through which I have transited countless times.

United Airlines details 6 big inflight entertainment updates, including all-new Control Tower map, by Zach Griff in The Points Guy, is thick with welcome news for frequent United fliers, of which my wife and I are two. (So far I have clocked 1,533,214 miles with United, and she has about double that. We are also both lifetime members of the United Club, thanks to her wise investment in a sweet deal offered by United decades ago—and withdrawn quickly after that.)

We are also both T-Mobile customers, which is why seeing this on my phone’s United app is always welcome:

It mostly works. Yes, it’s slow, but it is possible to get work done at altitude at no extra cost. Very nice.

Here’s the rundown of new stuff from United, according to Zach:

New 4K seatback screens

“Bluetooth connectivity at every seat, making it possible to pair wireless headphones, including AirPods, with the entertainment systems.”

Control Tower View: “This view allows you to pull up a map of the airfield that’s overlaid with all the aircraft around you. Clicking on each little plane brings up its call sign and operating airline… powered by the flight deck’s navigation systems and a live feed of planes from FlightAware.”

Starlink is coming. This will be a huge boost in bandwidth for passengers connecting to the Net on flights.

Channel 9 (now called “From the Flight Deck”) is still there.

Here is what that last item looked like, the last time I saw the option and shot it on a plane:

Two problems with this. First, it’s buried deep among entertainment options (Zach says it’s under the Audio tab). Second, most, if not all, pilots don’t bother with it. Back in the primitive Channel 9 days, more than half of United pilots turned it on (or left it on, if it was there by default). Or at least that was my experience. Zach:

That said, in recent years, United’s aviation enthusiast customers have complained that Channel 9 hasn’t been working. According to Green, the functionality is still very much available — it’s just that more and more pilots are choosing not to enable the feed.

“Pilots can turn it off if they don’t want people to hear whatever chatter is going on,” Green explained. All hope isn’t lost, though. “We know it’s a cool feature,” Green added. “We have it enabled as much as we can.” (That’s Dominic Green, United’s director of inflight entertainment.)

I suggest that United urge their pilots a bit more forcefully to turn the feature on. Fellow pilots and aviation buffs love having it there. And people afraid of flying also find it comforting to hear pilots sounding totally calm while flying through turbulence.

One more thing. While it would be nice to screen-share the seat-back screen from your own device, there’s this:

Extended displays aren’t coming soon

As more travelers bring their own tablets and laptops on board, I’ve always wondered why airlines don’t offer a screencasting feature on the entertainment screens. After all, it’s a lot easier to get work done with some additional screen real estate.

Turns out, United can already technically enable screencasting today, Green said, but a major issue will likely mean that flyers will never see this functionality.

“We’ve had a lot of pushback from our corporate partners that say they don’t necessarily want their staff using publicly available screens,” Green said.

It’s not that United would have access to any data; it’s that they don’t want nosy passengers snooping in on their neighbors.

Makes sense. But I think United could partner with Apple and Google to create Apple PlanePlay and Google Plane (or whatever names they want to come up with), so at least some personal apps could extend onto the seatback.

Anyway, progress.



Wrench in the Gears

Alison Reflects on Joseph Gonzalez’s Theory of Quantum Realism – A Hopeful Participatory Universe Built On Love!

Below are Joseph Gonzalez’s responses to questions I posed about his theory of quantum realism. You can find out more at his blog bantamjoe.com. I discuss these questions in the video below starting at timestamp 20 minutes. Select links from Bantamjoe.com: The Self-Aware Circle of Consciousness: Here A Holographic Reality Designed By Consciousness, Designed For [...]

Below are Joseph Gonzalez’s responses to questions I posed about his theory of quantum realism. You can find out more at his blog bantamjoe.com. I discuss these questions in the video below starting at timestamp 20 minutes.

Select links from Bantamjoe.com:

The Self-Aware Circle of Consciousness: Here

A Holographic Reality Designed By Consciousness, Designed For You!: Here

Fractal Dimensions of Consciousness: Here

Love, The First Action of Consciousness: Here

Consciousness From Microtubules to Brain Waves: Here

Simulation Theory Vs. Quantum Realism: Here

Link to Heart Forgiveness website.

Elevator Pitch for Quantum Realism

Quantum Realism proposes that the physical universe is not fundamental but instead a virtual construct generated by a self-sustaining quantum network or primal consciousness. The physical phenomena we observe, including space, time, energy and matter, are outputs of this underlying quantum process, more like a computational system where the quantum field serves as the substrate for reality. Unlike classical realism, which treats matter as independent and static, Quantum Realism views the universe as a dynamic, evolving system, continuously shaped by observation and interaction.

This theory aligns with the strange findings of modern physics, such as quantum superposition, entanglement, and the observer effect. These phenomena indicate that reality behaves more like information being processed than a collection of static objects. By grounding itself in the principles of quantum mechanics and information theory, Quantum Realism provides a framework that explains physical laws but also integrates consciousness as an active participant in the unfolding of reality. 

How Quantum Realism Differs from Other Theories

Quantum Realism stands apart by uniting consciousness and matter within a single quantum substrate. Classical physics focuses on deterministic interactions of matter, while dualism separates consciousness from the material. Quantum Realism suggests that matter emerges from the consciousness field, which acts as a self-generative, non-physical substrate. This explains phenomena like wave-particle duality and the observer effect, where the act of measurement collapses probabilities into specific outcomes.

In contrast to other simulation theories, which claim an external programmer, Quantum Realism describes a holographic reality that is self-generating, requiring no external creator. The consciousness quantum network acts as both the origin and sustainer of reality, dynamically evolving through recursive feedback loops. By embedding consciousness within this process, Quantum Realism resolves the paradox of how non-physical awareness can influence a seemingly physical world, offering a more integrated perspective than either dualism or materialism.

What Led You to This Investigation?

Stemming from my childhood curiosity about electricity and light, I started learning the math that was required. It was the whole concept of imaginary numbers that got me questioning reality.

The idea that light oscillates into an imaginary 4th dimension emerged from the mathematical framework of quantum mechanics. I later discovered that imaginary numbers are fundamental to describing oscillatory behavior in wavefunctions, suggesting that light and other quantum phenomena extend beyond the three spatial dimensions we perceive. This mathematical insight led me to consider whether these higher-dimensional oscillations underpin the strange behaviors observed in quantum systems, such as superposition and entanglement.

For me this opened a pathway to investigate how higher-dimensional processes might generate the physical universe as a lower-dimensional projection. By focusing on the recursive nature of these oscillations, I began exploring how reality could arise from iterative feedback within a quantum network. This exploration ties into Quantum Realism’s premise that space, time, and matter are emergent constructs, projected from the recursive operations of the quantum field.

The Circle of Values

Everything in the universe oscillates (light waves, matter waves, and even biological rhythms) indicating an underlying recursive principle. This principle, that I refer to as the “Circle of Values,” describes how primal consciousness perpetuates creation through cyclic processes. Each oscillation represents a feedback loop in which information from the quantum field interacts with localized nodes, generating structure and coherence across scales. These cycles are fractal-like, with self-similar patterns appearing at every level of reality, from quantum fluctuations to galactic rotations.

This recursive behavior mirrors the properties of fractal mathematics, where complexity arises from simple iterative rules. In Quantum Realism, the Circle of Values represents consciousness in action, continuously manifesting and sustaining the fabric of reality. By oscillating through states, this process generates not only physical structures but also the informational coherence necessary for life and awareness to emerge, illustrating the very deep interconnection between consciousness and the material world.

Holographic Reality vs. Simulations

A holographic reality differs fundamentally from a simulation in that it requires no external programmer or hardware. Instead, it emerges intrinsically from within the quantum field itself, with every fractal part containing information about the whole. This self-generative nature is supported by quantum entanglement, where particles separated by vast distances remain instantaneously correlated. Such behavior suggests that reality operates like a hologram, where local phenomena reflect an interconnected structure.

In contrast to simulations, which rely on predefined algorithms and external input, a holographic reality evolves dynamically through feedback between consciousness agents and the quantum network. This perspective resolves the apparent separation between observer and observed, as both are integrated within the same system. Quantum Realism states that the universe is not a static construct but a living, self-referential system that adapts and evolves as consciousness interacts with it.

Free Will: Participants or Puppets?

Quantum Realism portrays localized consciousness as an active participant in shaping reality. While physical parameters provide constraints (such as natural laws and probabilities) free will emerges as consciousness influences the collapse of quantum states. The observer effect demonstrates this participatory role, where the act of measurement determines which potential becomes reality. Thus, consciousness is not merely a bystander but an integral component of the quantum network.

This dynamic allows for meaningful choice within the parameters of the system. While deterministic factors like genetics and environment play a role, the ability of consciousness to interact with the quantum field introduces variability and agency. In this way, Quantum Realism reconciles the apparent tension between determinism and free will, framing individuals as co-creators within an evolving reality.

What Is Quantization?

Quantization is the process by which continuous possibilities are restricted to discrete units, such as energy levels in an atom. This principle is essential for the stability of physical systems, as it prevents infinite regress and ensures coherence. Without quantization, systems would lack the precision needed for consistent behavior, leading to paradoxes like infinite energy densities in classical physics.

In Quantum Realism, quantization also reflects the underlying processing structure of the quantum network. By discretizing energy, time, and space, the system ensures that interactions remain finite and manageable, enabling the emergence of stable structures like atoms and molecules. This discrete framework not only resolves the infinities problem but also provides a foundation for understanding how coherence and order arise in a fundamentally quantum reality.

Time, Causality, and Superposition

In Quantum Realism, time is not an absolute backdrop but an emergent property of processing cycles within the quantum field. Each cycle represents a discrete step in the evolution of reality, giving rise to the ordered flow of events we perceive as time. Superposition, where particles exist in multiple states simultaneously, reflects the quantum field’s ability to hold and process many potentials before they are collapsed into specific outcomes.

Causality emerges from this collapse, as interactions within the quantum network resolve probabilities into ordered sequences. This dynamic interplay between potential and actualization creates the linear progression we experience, while superposition ensures that the system retains flexibility and adaptability. Together, these principles demonstrate how time, causality, and quantum phenomena are deeply interconnected within the framework of Quantum Realism.

The Role of Our 3D Bodies

Our 3D bodies serve as localized nodes within the quantum network, enabling consciousness to interact with the virtual constructs of space and time. By acting as receivers and processors of quantum information, these bodies provide the specificity and coherence needed for complex interactions, such as perception, learning, and decision-making. This localized embodiment allows consciousness to experience and express reality in a focused and constrained way, facilitating exploration and growth.

At the quantum level, our bodies are composed of informational systems governed by quantized interactions, such as atomic vibrations and molecular bonds. These interactions provide the stability necessary for biological processes, while also linking us to the larger quantum network. This dual role (as both physical entities and conduits for consciousness) illustrates the magnificent integration of body, mind, and the quantum substrate.

Reflections on Itzhak Bentov’s Premise

Though admittedly I don’t know much about Bentov, his idea that life represents organized information gifted back to the cosmos aligns with Quantum Realism’s view of consciousness as a dynamic contributor to the quantum field. Each life accumulates unique experiences, which are reintegrated into the system upon death, enriching its informational density and coherence. This cyclical process mirrors the conservation of energy and information observed in physical systems.

By framing life as an exchange of information between localized consciousness and the quantum network, Quantum Realism provides a holistic view of existence. Death is not an end but a transition, where individual contributions are folded back into the universal substrate, furthering the evolution of the system as a whole. This perspective offers a scientific, existential, and spiritual understanding of life’s purpose and continuity.

Hope in Navigating 3D Reality

Quantum Realism offers a hopeful perspective by framing life as an integral part of a coherent, meaningful process. By recognizing that consciousness actively shapes reality, individuals can embrace their role as co-creators within this evolving system. This participatory view inspires purpose and creativity, transforming challenges into opportunities for growth and contribution.

The recursive and interconnected nature of the quantum network suggests that even small actions can have far-reaching effects. This reinforces the idea that every individual’s choices matter, contributing to the larger process of coherence and complexity. In this way, Quantum Realism not only explains the nature of reality but also provides a framework for navigating it with intention and hope.


Talking Identity

The Call Is About To Come From Inside The House

You would have to be living under a rock to have missed all the talk about Agentic AI, and how it is going to revolutionize the way we live and work. AI-powered agents will be anything and everything – from personal shopper to travel concierge, executive assistant to inventory manager, medical diagnostician to customer service […]

You would have to be living under a rock to have missed all the talk about Agentic AI, and how it is going to revolutionize the way we live and work. AI-powered agents will be anything and everything – from personal shopper to travel concierge, executive assistant to inventory manager, medical diagnostician to customer service representative, software developer to security pentester. Article after article is devoted to both the opportunities and the risks. And when it comes to risk, all of us working in the Digital Identity space are not prepared for what is coming.

Photo by Growtika on Unsplash

In the wake of OpenAI releasing Operator, a Computer-Using Agent (CUA), in research preview, I’ve read many breathless posts about the future of Web-based Agentic AI (as opposed to API-based Agentic AI), and how it makes every website “programmable”, even without APIs. If you have worked in software development, you can visualize the mechanics easily – it’s like giving a QA Automation tool like Selenium WebDriver a brain, so that instead of just automating web applications for rinse-and-repeat testing, it can actually read the data, make decisions, adjust course, and take action. That framing should also make it easy to immediately grok how this will break the identity and security infrastructure we currently have, or are putting in place. I mean, we have been dealing with these in our QA Automation projects forever. I thought I’d share the thoughts that immediately jumped to my mind, mostly because I need to not be the only one worrying about these (#MiseryLovesCompany).

1) Bypassing/Breaking/Compromising Authentication Mechanisms

Since CUAs rely on web-based browsing, they necessarily run into some of the same break points that QA automation runs into – like multi factor authentication, bot verification techniques, and more. Any CUA would currently have to give the user back control of the browser to take these actions before proceeding. This high friction point is going to run head first into consumer dissatisfaction and business mandates to “just make it work”, and all of us in identity can guess exactly what will follow:

Users will hand over passwords to their Agent service so it can log in as them, or grant them access to their password managers (probably as a feature getting built into first the browser password manager and then the generic password managers).

Users will turn off MFA to allow their agents to work.

Any guesses on what will happen to passkeys? If syncing of the private key was the worst that you thought could happen….

There will be people looking at how authorized session hijacking can become a feature to leverage, much like how Selenium allows direct manipulation of cookies and local storage, enabling agents to hoover up valid session tokens and bypass login screens. Case in point: Build MCP servers for any website with automatic browser auth syncing.

Just like Selenium can sometimes bypass automation (bot) detection protections using headless browsing and user-agent spoofing, expect Agentic AI tools to develop capabilities to do the same.

2) Violating Authorization Boundaries (When They Exist)

QA Automation scripts often execute actions as a high-privilege test user (e.g., an admin account) to avoid breaking tests that are verifying functionality but not data or access restrictions. The rush to deploy Web-based Agentic AI tools will mean that, like other tools of the past, they won’t be built with proper scope controls, thereby driving implementors to grant them excessive privileges. You can guess the rest.

As for consumer applications, those rarely have differentiated access control models built in for their users. That means a customer who wants to use a CUA but limit what it can and cannot do will be out of luck. We saw this play out in the days of screenscraping-based personal finance applications, and how long it took for us to move everyone over to OAuth2 and FAPI as the better and more secure approach.

3) Weakening Security Controls

(aka “Is that a DDoS attack, or an army of Agents here to take advantage of the limited time deal we announced?”)

It won’t just be Authentication controls that are impacted. There are many security protections that will likely be obstacles in the path of Web-based Agentic AI. Would any of us be surprised to find out that IT teams were told to weaken or disable security mechanisms (e.g., Content Security Policy, SameSite cookies, Bot and DDoS detection) to facilitate automated agents, inadvertently creating vulnerabilities?

And these are just what immediately jumped to mind. I am sure there are many more that I’m not even thinking of.

Identity vendors and practitioners everywhere really need to shift into high gear to help organizations properly prepare for what’s headed their way. The demand to support Web-based Agentic AI will put a great deal of pressure on them to enable safe acceptance, and being the “Agent of No” (see what I did there) is not likely to go well. As for what can be done – more on that later.


Damien Bod

Multiple client sign-in customizations using Duende identity provider

This post looks at customizing the sign-in UI and the sign-in options in an ASP.NET Core application using Duende IdentityServer and ASP.NET Core Identity. There are multiple ways of changing the look and feel of the UI for different OpenID Connect clients or different client flows. Code: https://github.com/damienbod/duende-multi-tenant Blogs in the series Setup The solution […]

This post looks at customizing the sign-in UI and the sign-in options in an ASP.NET Core application using Duende IdentityServer and ASP.NET Core Identity. There are multiple ways of changing the look and feel of the UI for different OpenID Connect clients or different client flows.

Code: https://github.com/damienbod/duende-multi-tenant

Blogs in the series Multiple client sign-in customizations using Duende identity provider Customizing a single client sign-in using parameters in Duende IdentityServer Setup

The solution is setup using three different ASP.NET Core applications. In the example code, the “Admin” application has different federation authentication options compared to the “Shop” client authentication sign-in experience. The client ID from the authentication context is used to customize the look and feel, i.e. the styles, the layout and the options of the client are used to define which federation and authentication options are possible.

Customization of the sign-in options

The EnableLocalLogin parameter and the IdentityProviderRestrictions parameter from Duende IdentityServer can be used to change the sign-in options for the end user of the applications. If the EnableLocalLogin option is set to false, the local username/password login is disabled. The IdentityProviderRestrictions setting can be used to define which federation options are allowed for the client sign-in.

new Client
{
    ClientId = "shop-client-ui",
    // more client options ...
    // show/hide the local authentication screen
    EnableLocalLogin = false,
    // federated authentication options to display
    // empty displays all
    IdentityProviderRestrictions = ["AdminEntraID"]
},

Layout Customization for clients

Sometimes the identity provider application needs to display a different look and feel for the different clients. To achieve this, a separate login screen is used and each login screen uses a different layout. The layout and the style are changed using the client ID from the authorization context. If the shop client is used, the user is redirected to a different Razor Page. The UseShopClientDisplay bool is used for this.

private async Task BuildModelAsync(string? returnUrl)
{
    Input = new InputModel
    {
        ReturnUrl = returnUrl
    };

    var context = await _interaction.GetAuthorizationContextAsync(returnUrl);

    if (context?.Client.ClientId == "shop-client-ui")
    {
        View = new ViewModel
        {
            UseShopClientDisplay = true
        };

        // Process in the shop client login
        return;
    }

    // standard code from duende template
}

In the OnGet method of the Login Razor Page, the user is redirected to the ShopClient Razor Page, which uses a different layout, if UseShopClientDisplay is true. The returnUrl is passed as a parameter.

public async Task<IActionResult> OnGet(string? returnUrl)
{
    await BuildModelAsync(returnUrl);

    if (View.IsExternalLoginOnly)
    {
        // we only have one option for logging in and it's an external provider
        return RedirectToPage("/ExternalLogin/Challenge", new { scheme = View.ExternalLoginScheme, returnUrl });
    }

    if (View.UseShopClientDisplay)
    {
        return RedirectToPage("ShopClient", new { returnUrl });
    }

    return Page();
}

The HTML part of the Razor Page uses a different Layout and the Layout is set explicitly in the Razor Page.

@page
@model IdentityProvider.Pages.Login.ShopClient
@{
    Layout = "Shared/_LayoutShopClient";
}

<div class="login-page">
    <div class="lead">
        <h1>Shop Client Login</h1>
        <p>Choose how to login</p>
    </div>

Different options, styles, and layouts can be set up for any client.

Or a different client display using the ClientID as the switch:

With this setup any CSS and any layout can be used for the different clients. This is one way of having a multi-tenant or multiple client setup. This setup uses a different Client ID to style and add or remove options.

Notes

This works well and does not require much effort. Sometimes customization is required within a single client. If you intend to use this in a multi-tenant solution, you should disable the default sign-in screen if building from the ASP.NET Core Identity templates. In a follow-up post, I will look at further customization within a single client.

Links

https://docs.duendesoftware.com/identityserver/v7

https://docs.duendesoftware.com/identityserver/v7/ui/federation/

https://learn.microsoft.com/en-us/aspnet/core/razor-pages


Mike Jones: self-issued

Twenty Years of Digital Identity!

Kim Cameron first told me what Digital Identity is on February 1, 2005. He said that the Internet was created without an identity layer. He encouraged me “You should come help build it with me.” I’ve been at it ever since! What I wrote about digital identity a decade ago remains as true today: An […]

Kim Cameron first told me what Digital Identity is on February 1, 2005. He said that the Internet was created without an identity layer. He encouraged me “You should come help build it with me.” I’ve been at it ever since!

What I wrote about digital identity a decade ago remains as true today:

An interesting thing about digital identity is that, by definition, it’s not a problem that any one company can solve, no matter how great their technology is. For digital identity to be “solved”, the solution has to be broadly adopted, or else people will continue having different experiences at different sites and applications. Solving digital identity requires ubiquitously adopted identity standards. Part of the fun and the challenge is making that happen.

I’m not going to even try to list all the meaningful identity and security initiatives that I’ve had the privilege to work on with many of you. But I can’t resist saying that, in my view, OpenID Connect, JSON Web Token (JWT), and OAuth 2.0 are the ones that we knocked out of the park. I tried to distill the lessons learned from many of the initiatives, both successes and failures, during my 2023 EIC keynote Touchstones Along My Identity Journey. And there’s a fairly complete list of the consequential things I’ve gotten to work on in my Standards CV.

I’ll also call attention to 2025 marking twenty years of the Internet Identity Workshop. I attended the first one, which was held in Berkeley, California in October 2005, and all but one since. What a cast of characters I met there, many of whom I continue working with to this day!

As a personal testament to the value of IIW, it’s where many of the foundational decisions about what became JWS, JWE, JWK, JWT, and OpenID Connect were made. Particularly, see my post documenting decisions made at IIW about JWS, including the header.payload.signature representation of the JWS Compact Serialization and the decision to secure the Header Parameters. And see the posts following it on JWE decisions, naming decisions, and JWK decisions. IIW continues playing the role of enabling foundational discussions for emerging identity technologies today!

It’s been a privilege working with all of you for these two decades, and I love what we’ve accomplished together! There’s plenty of consequential work under way and I’m really looking forward to what comes next.

Images are courtesy of Doc Searls. Each photo links to the original.

Saturday, 01. February 2025

Werdmüller on Medium

I want you to do these four things right now

Some simple steps to keep yourself safe. Continue reading on Medium »

Some simple steps to keep yourself safe.

Continue reading on Medium »

Friday, 31. January 2025

@_Nat Zone

Global Digital ID and Authentication Technology Trends: January 31 Edition

UK: AI-based age verification; US: a fintech acquisition; Global: progress on Digital Travel Credentials (DTC); Azerbaijan: a digital development concept; Europe: linking social media and digital ID; growth of the digital ID market; Malaysia: blockchain…
UK: AI-based age verification
John Lewis is the first to introduce AI-powered age verification for online knife sales. The system uses facial age estimation technology developed by Yoti and applies a high age threshold to avoid the risk of misjudging customers under 18.

US: fintech acquisition
iCapital acquires Parallel Markets to streamline investor onboarding. The deal provides a reusable investor passport that reduces the compliance burden, with growing automation driven by AI and machine learning.

Global: progress on Digital Travel Credentials (DTC)
Regula expands its Document Reader SDK to support Digital Travel Credentials. Virtual components can be created with a smartphone or passport reader, aiming to enable fully digital verification by public authorities.

Azerbaijan: digital development concept
The president launched a "Digital Development Concept" that promotes the digitalization of public administration and the integration of advanced technologies, aiming for economic growth through digital education and ICT training.

Europe: linking social media and digital ID
Spain's prime minister proposes linking EU social media accounts to the EU Digital Identity Wallet, reducing anonymity and encouraging accountability for behavior, with the goal of protecting children and curbing the influence of bots.

Growth of the digital ID market
The market was worth $32.4 billion in 2023 and is forecast to reach $142.6 billion by 2032, a compound annual growth rate of 17.90% from 2024 to 2032.

Malaysia: blockchain-based digital ID
MYEG and MyDigital ID Solutions are cooperating to build a nationwide blockchain-based digital ID ecosystem, providing an e-wallet and developer tooling environment and addressing the regional need for interoperable digital standards.

(Source) https://www.thinkdigitalpartners.com/news/2025/01/27/digital-identity-global-roundup-201/

UK Electronic Travel Authorisation (ETA)
The UK is expanding its Electronic Travel Authorisation (ETA) system for border security. Travellers from visa-exempt countries must undergo advance digital screening, with the goals of streamlining entry procedures and strengthening security.

South Korea's digital residence cards
Foreign residents can apply for a digital residence card using their smartphone. The digital card has the same legal force as the physical card, and plans are under way to integrate it with digital financial services.

Global trends in digital identity
Prove acquires Portabl, focusing on reusable identity verification. Qatar establishes a unified digital ID system as part of its national digital authentication strategy. Credas Technologies expands internationally on the back of its growth in the UK. Juniper Research predicts that more than two-thirds of the world's population will own a digital wallet by 2029.

(Source) https://www.thinkdigitalpartners.com/news/2025/01/20/digital-identity-global-roundup-200/

Nigeria's digital ID initiative
Nigeria's National Identity Management Commission (NIMC) is exploring ways to strengthen its digital ID rollout, including plans to add iris biometrics to the enrollment process. The addition is intended to give citizens with disabilities access to more services.

Serbia's digital identity wallet
Serbia plans to introduce a digital identity wallet by the end of 2025, with the rollout depending on the standardization of EU protocols. The EU digital identity framework entered into force in May 2024, and member states must provide an EU wallet by 2026.

UK identity verification partnership
Vouchsafe partners with Carefree to support unpaid carers who lack photo ID, providing an integration that lets them prove their identity without a photo. The aim is to simplify registration and improve unpaid carers' access to services.

AI and identity verification technology in Europe
IDnow joins the EU-funded ACHILLES project, which receives more than EUR 8 million under the Horizon Europe Framework Programme. The project aims to improve the efficiency and reliability of AI systems, with applications in several fields including identity verification.

(Source) https://www.thinkdigitalpartners.com/news/2025/01/13/digital-identity-global-roundup-199/

Morocco's digital ID expansion
Around 4.7 million electronic national ID cards were issued in 2024. The second phase of the digital ID system, built on a trusted third-party identity verification platform, is under way. Framework agreements have been signed with 17 key institutions, including Bank Al-Maghrib and healthcare providers, and more than 30 institutions use the platform to deliver basic services.

Nigeria's ID4D project
The World Bank extends the ID4D project to December 2026, targeting 180 million new IDs. At the government's request, the expanded project will modernize the national ID management system at a cost of $83 million. Biometrics already secure more than 64 million bank accounts. The project budget rises to $430 million to accelerate national registration and NIN issuance.

Digital ID efforts in the United States
Illinois enacts a law on mobile ID cards and privacy protection. Google Wallet expands support for driver's licenses and state IDs to six states. authID joins the Accountable Digital Identity Association to promote standardized decentralized identity services. Digital ID and blockchain ticketing significantly cut waiting times at stadiums.

Digital ID progress worldwide
Vietnam requires social media users to verify their accounts with a mobile number or national ID. Saudi Arabia has issued more than 28 million unified digital IDs through the Absher platform. South Korea pilots mobile ID cards in nine regions. Saint Kitts and Nevis plans to launch an electronic ID system by the end of 2025 in cooperation with Taiwan's ICDF.

(Source) https://www.thinkdigitalpartners.com/news/2025/01/06/digital-identity-global-roundup-198/

Thursday, 30. January 2025

The Pragmatic Engineer

The Pulse #122: DeepSeek rocks the tech industry

Almost unknown Chinese lab releases AI model that’s open, free, and as good as ChatGPT’s best models. Oh, and it’s also cheaper to operate. This has sent shockwaves through the AI sector

The Pulse is a series covering insights, patterns, and trends within Big Tech and startups. Notice an interesting event or trend? Send me a message.

This week, a massive event shook the tech industry: a lesser-known Chinese AI lab shocked the markets and tech professionals with the DeepSeek AI model, which feels on a par with OpenAI’s most capable publicly available model, ChatGPT o1. OpenAI has a more advanced o3 model, but it’s in preview and isn’t publicly available yet. DeepSeek is released as open and free to use within the DeepSeek app, or for anyone to download and host.

Major AI companies are coming to terms with the fact that a small team in China with supposedly little funding, and no access to NVIDIA’s latest AI chips, could pull this feat off. It shatters the image of OpenAI’s invincibility, the notion that the US leads the AI race, and also raises the question of whether open models will turn advanced LLMs into a commodity.

Today, we cover:

The first “thinking model” that feels fast – and is a hit

About 4x cheaper — and possibly more efficient? — than ChatGPT

Open model spreads fast

OpenAI’s need to remain fully closed highlighted by DeepSeek

How did DeepSeek do it, and why give it away for free?

Geopolitics and export controls

Google feared open source AI will win

1. The first “thinking model” that feels fast – and is a hit

On Monday, NVIDIA’s valuation plummeted from $3.5 trillion to $2.9 trillion; an almost $600B reduction in its market cap on a 17% drop in the stock price. This was reported as the biggest ever fall by a U.S. company. The cause? A new Large Language Model (LLM) called DeepSeek, built by a Chinese AI startup, which has been an overnight sensation. Also on the same day, the DeepSeek app (built by the same company) hit the #1 spot on the US App Store on both iOS and Android, making it more downloaded than ChatGPT, which was relegated to #2. DeepSeek has remained #1 since.

Top iOS apps in the US. It’s rare for an app by a Chinese developer to hit top spot. DeepSeek is #1 on Android as well

What’s the cause of Deepseek’s sudden popularity? It’s thanks to the company updating the app to enable its “DeepThink (R1)” mode that uses their DeepSeek-R1 model. This model is similar to OpenAI’s o1 model in that it takes more ‘thinking time’ to respond, by using more compute to serve up a better response.

A big difference is that DeepSeek displays the model’s “chain of thought”, whereas OpenAI hides what happens during the “thinking” phase. So, the model feels much more “snappy” than OpenAI’s o1, more transparent, and more relatable. And frankly, it’s a far better experience to watch the model “think out loud” for 30 seconds, than watching ChatGPT’s spinner for 30 seconds.

Here’s a good example of what happens when asking a question that trips a lot of LLMs up: “if a chicken says ‘all chickens are liars’ is the chicken telling the truth?” DeepSeek starts to “think” for nearly a minute, spitting out pages-worth of internal monologue:

DeepSeek shows its inner prompts while “thinking” up a response. It generated four times as much to answer the riddle I posed

In the end, the answer it generates concludes the question is a paradox. The output is pretty similar to what OpenAI’s o1 produces, except o1 takes around the same time (38 seconds) to “think” and doesn’t show anything to the user.

DeepSeek: free, OpenAI: $20-200/month. An obvious reason for the DeepSeek app’s popularity is that it’s free and offers virtually the same functionality as paid ChatGPT plans, which cost $20/month for limited access, and $200/month for unlimited access to the advanced o1 and o1-mini models. DeepSeek offers all of this for free, while somehow dealing with what look like enormous loads. The key to this is that DeepSeek seems to be an order of magnitude cheaper to operate than existing models, like OpenAI’s.

2. About 4x cheaper — and possibly more efficient? — than ChatGPT

The team behind DeepSeek found dozens of approaches to improve efficiency of their model – and published these optimizations in a paper titled DeepSeek-V3 Technical Report. Novel optimization methods include:

Read more


Heres Tom with the Weather

Win Stupid Prizes

Deadly D.C. Plane Crash Comes Months After Congress Ignored Warning About Traffic at Reagan Airport As the new administration is playing stupid games, yesterday morning, prior to yesterday’s aviation disaster, professor Thomas Schaller cautioned: An FAA employee I know confirms agency already lacks sufficient air traffic controllers. The so-called “buyouts” and other attacks on federa

Deadly D.C. Plane Crash Comes Months After Congress Ignored Warning About Traffic at Reagan Airport

As the new administration is playing stupid games, yesterday morning, prior to yesterday’s aviation disaster, professor Thomas Schaller cautioned:

An FAA employee I know confirms agency already lacks sufficient air traffic controllers. The so-called “buyouts” and other attacks on federal employees won’t help. Remember that fact when the flight delays (crashes?) commence and Trumpers start falsely blaming DEI or Biden.

This should be a wakeup call and I have a deeper appreciation for people like Phyllis Fong who this week have resisted the illegal orders that are already causing significant harm. On the other hand, if you like anarchy and disaster, congratulations.

Wednesday, 29. January 2025

The Pragmatic Engineer

Building a best-selling game with a tiny team – with Jonas Tyroller

In today’s episode of The Pragmatic Engineer, I’m joined by Jonas Tyroller, one of the developers behind Thronefall, a minimalist indie strategy game that blends tower defense and kingdom-building.
Stream the Latest Episode

Available now on YouTube, Apple and Spotify. See the episode transcript at the top of this page, and a summary at the bottom.

Brought to You By

Formation — Level up your career and compensation with Formation.

WorkOS — The modern identity platform for B2B SaaS

Vanta — Automate compliance and simplify security with Vanta.

In This Episode

In today’s episode of The Pragmatic Engineer, I’m joined by Jonas Tyroller, one of the developers behind Thronefall. Thronefall is a minimalist indie strategy game that blends tower defense and kingdom-building, now available on Steam. Developed by just two developers, it sold 1M copies in the first year of its launch: quite the hit for a game priced at $12.99!

A screenshot from the game Thronefall. This episode goes behind the scenes to walk through how two devs built this indie hit game

Jonas takes us through the journey of creating Thronefall from start to finish, offering insights into the world of indie game development. We explore:

Why indie developers often skip traditional testing and how they find bugs

The developer workflow using Unity, C# and Blender

The two types of prototypes game developers build

Why Jonas spent months building game prototypes that each took just 1-2 days

How Jonas uses ChatGPT to build games

Jonas’s tips on making games that sell

And more!

If you enjoy playing games, or one day perhaps could see yourself building your own game: this episode is for you.

Takeaways

My biggest takeaways from this episode:

It takes a lot of different skills to be a successful indie game developer. Jonas balances design, development, creating music, thinking about marketability, creating YouTube videos to share process (and get traction), building prototypes — just to name a few activities.

Graph theory can be surprisingly useful at times! Jonas used the A* pathfinding algorithm in this game, and fought a lot to make pathfinding “feel right.” It’s a good example of how computer science theory can become useful in various situations.

Engineering best practices don’t apply to most indie games. Jonas was almost embarrassed to admit they don’t do code reviews, don’t write unit tests and that the code quality could be better. However, indie games are built to ship once: so why overdo it? By all accounts, Thronefall is a smash success, with close to 1M sales. Code reviews and unit tests would not have helped this project with two experienced developers — though the lack of them might slow down larger projects!

To be good at building games: build a lot of games! Jonas made game development sound easy. However, I learned that he has 20 years of game development experience: having been building games since he was 9 years old! Here is a video about the close to a hundred games he’s built. Most of them are very simple.

It’s easier than ever to build a game. Unity provides excellent tooling, and there are so many resources on how to build games — videos, tutorials, blogs, books. If you know how to code, you can build a simple game, at the very least. We shared a tutorial: Building a simple game using Unity, if you want to get started.

The Pragmatic Engineer deepdives relevant for this episode

Game development basics

Building a simple game using Unity

Timestamps

(00:00) Intro

(02:07) Building in Unity

(04:05) What the shader tool is used for

(08:44) How a Unity build is structured

(11:01) How game developers write and debug code

(16:21) Jonas’s Unity workflow

(18:13) Importing assets from Blender

(21:06) The size of Thronefall and how it can be so small

(24:04) Jonas’s thoughts on code review

(26:42) Why practices like code review and source control might not be relevant for all contexts

(30:40) How Jonas and Paul ensure the game is fun

(32:25) How Jonas and Paul used beta testing feedback to improve their game

(35:14) The mini-games in Thronefall and why they are so difficult

(38:14) The struggle to find the right level of difficulty for the game

(41:43) Porting to Nintendo Switch

(45:11) The prototypes Jonas and Paul made to get to Thronefall

(46:59) The challenge of finding something you want to build that will sell

(47:20) Jonas’s ideation process and how they figure out what to build

(49:35) How Thronefall evolved from a mini-game prototype

(51:50) How long you spend on prototyping

(52:30) A lesson in failing fast

(53:50) The gameplay prototype vs. the art prototype

(55:53) How Jonas and Paul distribute work

(57:35) Next steps after having the play prototype and art prototype

(59:36) How a launch on Steam works

(1:01:18) Why pathfinding was the most challenging part of building Thronefall

(1:08:40) Gen AI tools for building indie games

(1:09:50) How Jonas uses ChatGPT for editing code and as a translator

(1:13:25) The pros and cons of being an indie developer

(1:15:32) Jonas’s advice for software engineers looking to get into indie game development

(1:19:32) What to look for in a game design school

(1:22:46) How luck figures into success and Jonas’s tips for building a game that sells

(1:26:32) Rapid fire round

A summary of the conversation

Game Development Philosophy and Process

Finding a balance between personal interest and marketability: a key challenge for indie game developers. Finding something that will sell is difficult enough on its own, but finding something that will sell that you also enjoy working on is very tricky. The approach is to make a lot of things that you can imagine working on that are interesting and then try to measure or guess how well they might do on the market. Then, focus on the most promising ideas.

“What do I want to create today?” After opening up Unity, this is how a typical enough day starts for Jonas. He comes up with a random idea that he finds interesting and “just” makes it.

Games can be made very quickly. This is especially true for simple prototypes without fancy graphics or menus. Polishing and getting to production quality takes the most time. Keep in mind that Jonas has created close to a hundred small games before, though!

Unity + C#: the development stack Jonas uses.

Scenes: the backbone of the project structure. In Thronefall a scene is basically a level. Each scene contains objects with attached scripts that dictate their behavior. These scripts inherit from Unity’s MonoBehaviour class and are attached to game objects. We covered these concepts in the deepdive Building a Simple Game using Unity.

Indie developers often write 'spaghetti code'. It’s also common enough to not write any unit tests. Jonas believes that unit tests are not critical for smaller indie games: but they do become more important at a larger scale.

QA process: “works on my machine,” beta testing and release. Modern game engines are a “godsend” for fewer bugs. If it “works on my machine” there’s a fair chance it works on everyone’s machine with robust game engines. For indie games, bugs are usually found through self-testing, beta testing, and, finally, by real players after the game is released.

Tools, Workflow, and Team Dynamics

Visual editing + code changes: this is the workflow Jonas follows. When building a new level, most of his time is spent moving things around in the Unity editor. He uses Blender for 3D modeling, where models are created and then integrated into Unity.

No code review. Not for a small indie game with two developers!

Push straight to main branch. And try to not break the game for the other dev!

Splitting the work: Jonas focused more on the gameplay, and the other dev (Paul) did a lot more of the user interface. They also “cross-tested” each other’s work for functionality and the “fun factor.”

Listen to playtesters. Early playtesters reported that there were too few choices for the building upgrades: and so the team took this feedback and implemented more choices for upgrades.

Balancing levels is an art. Game devs are a lot better at playing the game than the average player. So they need to make levels to a difficulty that feels “stupid easy” for them – at least for early levels.

Strategy games are “snowbally.” This means that it is exponentially more difficult to recover from falling behind. To counter this, Jonas coded up enemies dropping gold to smooth out the economy curve. This made the game easier to balance because the amount of gold a player has at a given time is easier to predict.

Development and release

Prototyping: the first phase. The two devs created prototypes and mini-games, not spending more than 1-2 days on them. The time reserved for prototyping is scaled to the size of the project. For a two-year game, two months of prototyping could be a good rule-of-thumb. Prototypes help figure out what might sell well, and what the devs also enjoy working on.

Gameplay first. During the prototyping phase, gameplay is explored before visuals. Gameplay prototypes consist of simple shapes and colours; visual prototypes are created after gameplay and are composed of scenes without any logic or motion.

Major development challenge: pathfinding. For this game, Jonas spent a lot of time on this problem. How will units move “correctly?” How to ensure they don’t go through walls? The team bought a plugin from the Unity Asset Store using the A* pathfinding algorithm, and then customized it. Units don’t exactly move on nodes, so additional post-processing is required to ensure they are taking the straightest path between nodes (a conceptual sketch of this kind of smoothing follows below).

ChatGPT: helpful for development. Jonas uses it to generate skeleton code, which is then filled in, and to translate shader code or answer questions on unfamiliar subjects.

Steam: a no-brainer. Steam is the go-to platform for indie developers, and Steam sales usually make up the vast majority of sales compared to other platforms.

The launch: not as fancy as you’d think! As per Jonas, the launch was surprisingly anticlimactic. He just pressed a button in the Steam backend.
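On the pathfinding point above: the “straightest path between nodes” step is typically a smoothing pass over the raw A* waypoints. Thronefall itself uses a customized Unity Asset Store plugin (C# in Unity), so the snippet below is only a hypothetical, engine-agnostic TypeScript sketch of the idea; the names Point, isWalkable, hasLineOfSight, and smoothPath are all invented for illustration.

```typescript
// Hypothetical sketch of waypoint smoothing after A* (not the plugin's actual code).
type Point = { x: number; y: number };

// Walkability check for a grid cell; in a real game this would query the nav grid.
type IsWalkable = (x: number, y: number) => boolean;

// Returns true if the straight segment between a and b only crosses walkable cells.
// Sampling-based check: coarse, but easy to reason about in a sketch.
function hasLineOfSight(a: Point, b: Point, isWalkable: IsWalkable): boolean {
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const steps = Math.max(Math.abs(dx), Math.abs(dy)) * 4; // oversample the segment
  for (let i = 0; i <= steps; i++) {
    const t = steps === 0 ? 0 : i / steps;
    const x = Math.round(a.x + dx * t);
    const y = Math.round(a.y + dy * t);
    if (!isWalkable(x, y)) return false;
  }
  return true;
}

// Greedy "string pulling": from each waypoint, jump to the farthest later
// waypoint that is still directly reachable, dropping the ones in between.
function smoothPath(path: Point[], isWalkable: IsWalkable): Point[] {
  if (path.length <= 2) return path;
  const result: Point[] = [path[0]];
  let current = 0;
  while (current < path.length - 1) {
    let next = current + 1;
    for (let candidate = path.length - 1; candidate > next; candidate--) {
      if (hasLineOfSight(path[current], path[candidate], isWalkable)) {
        next = candidate;
        break;
      }
    }
    result.push(path[next]);
    current = next;
  }
  return result;
}
```

Off-the-shelf pathfinding plugins usually ship their own path modifiers for exactly this, so a hand-rolled pass like the one above is mainly useful for understanding what those modifiers do.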

The reality of being an indie developer

Bureaucracy: the dreaded part. Bureaucracy is what Jonas hates most about being an indie game developer: things like filling out Steam forms, legal documents, and dealing with lawyers.

Advice: don’t build your own game engine! The single biggest piece of advice Jonas has for a developer building their own game: do NOT turn it into building a game engine! It’s too tempting, but it is mostly a waste of effort.

“Bigger” games don’t necessarily sell better. Smaller games often have a better payoff-to-effort ratio because they are easier to make successful. The user experience should also be prioritized over the engineering behind it.

University was helpful. Jonas says that it is possible to be self-taught in game development. However, for him, going to a game development college connected him with a valuable network of people. He built both of his games that sold 1M+ copies with people he met at his university, HTW Berlin.

How to succeed as an indie game developer? Find the correct overlap between something you enjoy and something that will also perform well in the market. It is not enough to just make a perfect game; it must be the right game. You also need to make a game for others, not just yourself!

Resources & Mentions

Where to find Jonas Tyroller:

• X: https://x.com/jonastyroller

• LinkedIn: https://www.linkedin.com/in/jonas-tyroller-213a63144/

• YouTube: https://www.youtube.com/c/JonasTyroller

Mentions during the episode:

• Thronefall on Steam: https://store.steampowered.com/app/2239150/Thronefall/

• Unity: https://unity.com/

• C Sharp: https://en.wikipedia.org/wiki/C_Sharp_(programming_language)#

• Blender: https://www.blender.org/

• Adopting Software Engineering Practices Across the Team: https://newsletter.pragmaticengineer.com/p/engineering-practices

• Warp Digital: https://warpdigital.com/

• Islanders on Steam: https://store.steampowered.com/app/1046030/ISLANDERS/

• Coatsink: https://coatsink.com/

• Will You Snail on Steam: https://store.steampowered.com/app/1115050/Will_You_Snail/

• No Gravity Games: https://nogravitygames.com/

• Miro: https://miro.com/

• A* algorithm: https://en.wikipedia.org/wiki/A*_search_algorithm

• ChatGPT: https://chatgpt.com/

• Claude: https://claude.ai

• Github Copilot: https://github.com/features/copilot

• Godot: https://godotengine.org/

• GameMaker: https://gamemaker.io/en

• Game Design program at Hochschule für Technik und Wirtschaft (HTW) Berlin: https://gamedesign.htw-berlin.de/en/

• Opus Magnum on Steam: https://store.steampowered.com/app/558990/Opus_Magnum/

• Outer Wilds on Steam: https://store.steampowered.com/app/753640/Outer_Wilds/

• GAMEDEV: 10 Steps to Making Your First Game Successful: https://www.amazon.com/GAMEDEV-Steps-Making-First-Successful-ebook/dp/B08CBLXPB7

• Flow: The Psychology of Optimal Experience: https://www.amazon.com/Flow-Psychology-Experience-Perennial-Classics/dp/0061339202/

• Game Development Basics: https://newsletter.pragmaticengineer.com/p/game-development-basics

• Building a Simple Game: https://newsletter.pragmaticengineer.com/p/building-a-simple-game

Production and marketing by Pen Name. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.


The Tao of XDI

Take 1... Solid Pods and DIDs


My first attempt at building a decentralized app in this day and age will use Solid Pods and DIDs. The goal? A super simple “BooksWeLike” app—a place where I can review books and see what my friends are reading and enjoying.


What makes this app different is how it handles data. Unlike traditional apps where data lives in a centralized database, my app will let users store their own data in Solid Pods. Think of a Pod as your own personal data vault—you control who can access it and how it’s used. And instead of relying on centralized logins like Google or Facebook, I’ll use Decentralized Identifiers (DIDs), which allow users to prove their identity on their own terms.


The plan for the app is straightforward:

• If you already have a DID or a Solid Pod, you can sign in using your existing accounts.

• If you don’t, the app will help you create them when you sign up.


Of course, part of this journey is figuring out how practical and possible all of this really is. Beyond building the app, I’ll also evaluate the tools, SDKs, client libraries, and documentation available for Solid and DID developers. How well is the builder community being enabled? I’ll compare my experience with other distributed ecosystems as I attempt to replicate this app in different environments in the future. Once the app exists across multiple ecosystems, I can explore broader topics like ecosystem interoperability and federation.


These technologies are still evolving, and I’m excited to explore what’s possible—and what needs improvement.


So, what about you? Have you already taken the plunge into the world of DIDs or Solid Pods? Or is this your first time hearing about them? Let’s find out together as I document this journey.


In my next post, I’ll dive into the nitty-gritty of authentication—getting users to log in with their DIDs and connecting them to their Pods. I suspect it’s trickier than it sounds, but that’s all part of the adventure.
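To make that authentication piece a little more concrete before the next post: below is a minimal, hypothetical browser-side sketch of Solid-OIDC login using Inrupt’s @inrupt/solid-client-authn-browser library. The issuer URL, redirect handling, and app name are placeholder assumptions, not the actual BooksWeLike implementation.

```typescript
// Minimal sketch of Solid-OIDC login in the browser, assuming the
// @inrupt/solid-client-authn-browser library; URLs below are placeholders.
import {
  login,
  handleIncomingRedirect,
  getDefaultSession,
} from "@inrupt/solid-client-authn-browser";

// Call this when the user clicks "Sign in with your Pod".
async function startLogin(): Promise<void> {
  await login({
    oidcIssuer: "https://login.inrupt.com", // placeholder: the user's Solid identity provider
    redirectUrl: window.location.href,      // where the provider sends the user back
    clientName: "BooksWeLike",              // display name shown on the consent screen
  });
}

// Call this on every page load to complete the redirect, if one is in progress.
async function finishLogin(): Promise<void> {
  await handleIncomingRedirect();
  const session = getDefaultSession();
  if (session.info.isLoggedIn) {
    // The WebID identifies the user and links to their profile and Pod.
    console.log("Logged in as", session.info.webId);
  }
}
```

Once the session holds a WebID, reading and writing review data in the user’s Pod could then be done with a data library such as @inrupt/solid-client, though that is a choice for a later step.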

Tuesday, 28. January 2025

The Pragmatic Engineer

Taking on Google search: Perplexity and Kagi

Google’s dominance of web search is being challenged by innovative, fast-moving startups. A deepdive into two promising contenders: Perplexity and Kagi

Hi – this is Gergely with the monthly, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover challenges at Big Tech and startups through the lens of engineering managers and senior engineers. If you’ve been forwarded this email, you can subscribe here.

Google is the undisputed king of search engines, with around 90% market share. The product generates revenue exclusively via advertising, and it generates a lot of it. The tech giant’s revenue from ads placed in users’ search results has exploded in 20 years: from $1B in 2004 to around $200B last year. Today, around 70% of Google’s revenue comes from this single source alone.

Predictably, this has led to Google search starting to feel more like an “ads engine”, and less like a search engine. About a year ago, I began using alternatives, and have since replaced day-to-day use of Google search with two other products developed by startups:

Perplexity: an AI-powered search engine and answer engine, changing how people discover information online. It’s a VC-funded startup on the rise.

Kagi: a search engine with no ads, which gives “power back to the user.” A bootstrapped, profitable company growing at a reasonable pace.

To learn more about these challengers to Google search, I reached out to both to ask how their engineering teams operate in this bold mission to disrupt the web search market.

Note: I have no commercial affiliation with Perplexity or Kagi, and was not paid to write about them in this article. I also get no compensation from the special offers for my subscribers (see below). Indeed, it was I who asked if these companies would consider making these offers. See more in my ethics statement.

Special offer: paying subscribers of The Pragmatic Engineer get free access to paid versions of Perplexity and Kagi. As a paid subscriber, you get:

Perplexity Pro for one year (worth $200)

Kagi Ultimate for 3 months (worth $75)

Get Perplexity Pro and Kagi Ultimate

See more context and details on this offer.

Today, we cover:

Backgrounds and cultures. One of these companies is rumored to be valued at $9B, while the other has taken no VC-funding. Perplexity is mostly in-office these days, while Kagi is full-remote.

Tech stack and AI tools. Kagi uses the lesser-known Crystal programming language and builds a simple but novel message-passing system. Perplexity is built on React+NextJS, and also uses its own custom LLMs.

How does Kagi index? A search engine is only as good as its index. Kagi aims to only crawl high-quality websites, and to skip sites with too many ads. This custom index explains its distinctive search results, compared to other search engines.

How is Perplexity so fast? Speed is an important differentiator for Perplexity, and a source of constant tweaking. Surprisingly, the product runs multiple models for every query, under the hood.

Engineering practices. At Perplexity, there’s no formal design review process, and AI evaluations are defined upfront. At Kagi, there’s “checkpoint standups” and trunk-based development. Both companies use Linear for task and project management. For more about the innovative project management startup, Linear, check out this deepdive.

Product milestones. Perplexity employed a smart strategy of initially launching as something akin to a lightweight wrapper, and then built their own search index, API, and custom features such as “Shop Like a Pro.” Meanwhile, Kagi spent considerable time building out the core experience, and then launched lots of additions like FastGPT, Kagi Translate, and more.

Unique challenges of “search engine engineering”. For Perplexity, creating a “futuristic” experience was probably the trickiest part. For Kagi, it’s the attention to even the smallest details, in order to gain and retain customers.

This is an engineering culture deepdive. See similar ones about Meta, Amazon, Stripe, OpenAI, Figma and several other tech companies.

Thanks to Alex Lang (engineering lead at Perplexity), Zac Nowicki (tech lead at Kagi Search), and Vladimir Prelovac (founder and CEO at Kagi Search) for sharing interesting details.

1. Backgrounds and company cultures

Perplexity

Some stats about this Google search challenger:

2022: Founded by Aravind Srinivas (CEO), Denis Yarats (CTO), Johnny Ho (Chief Strategy Officer), and Andy Konwinski

$9 billion: Rumoured valuation as of December 2024; up from $520 million just a year ago. This is truly standout for such a young company!

20 million search queries per day: the latest milestone hit; almost a 10x increase on a year ago, when there were 2.5M search queries per day.

165 people: team size, two-thirds of whom work on the tech team.

The tech organization is currently structured as four groups, with some groups consisting of several teams:

AI

Search

Product Engineering

Design

Perplexity’s company values:

Curiosity

Quality

Ownership

Velocity

Perplexity is mostly in-office, after operating as remote in its early days. While scaling up, the company is focused on hiring for the office; or at the very least, for pods in the same city. Perplexity has locations in:

San Francisco: 70 people

New York: 14 people

Austin, Texas: 9 people

Europe: 32 people

Remote, in the US: 37 people

What’s it like to work at Perplexity? Alex Lang says ownership and communication are priorities:

‘Everything for us works backwards from the end-user. We want to imbue a spirit of curiosity and quality that reflects the best ideal of an answer engine. Once we have found the right projects to tackle, we place extreme emphasis on ownership and velocity.

We want everyone to have high ownership and place a high bar on ourselves. We also recognize that impact delivered today has higher value than some improvement in the distant future.

We try to minimize standing meetings. Ongoing communication is extremely important for us. We value using Slack to communicate, hold a quick huddle if needed – but keep meetings to a minimum.’

Kagi

Some figures about the business:

2018: founded by Vlad Prelovac, in Palo Alto, US.

$0 raised in VC funding. In 2023 and 2024, the company raised funds – from users! A total of $2.4M.

~750,000 search queries per day: See realtime stats.

40 people: team size.

Live stats: See realtime details on the number of members (over 37,000), queries, families using the product (more than 3,000), and more. This is a surprising level of transparency!

Realtime stats at Kagi. When customer milestones are hit, the company marks the occasion. Last time, it shipped a t-shirt and sticker pack to early users. Source: Kagi

Of the 40-strong team, 30 are engineers. Engineering is divided into:

Core products (Search, Assistants, Maps): 15 engineers

Research and development (building new stuff): 5 engineers

Kagi is a remote-first working environment. Team composition by location:

US and Canada: ~45%

Europe: 40%

Asia: 15%

What’s it like to work at Kagi? Zac Nowicki says communication matters:

‘Open communication is king, in public channels, using our internal forum, staying out of DMs (unless discretion warrants it, of course). Some people can be shy about asking simple questions, but it is so important for optimizing remote communication loops and making issues (and wins!) visible. The worst thing is when some key decision was made behind "closed doors" and has repercussions we need to walk back weeks later. Or people debugging minor issues with our systems, out of sight from an engineering colleague who could solve the root cause, or otherwise inform our DX improvements.

‘In a physical working space, there is constant implicit communication; overhearing others, seeing things put on walls and whiteboards. Even body language; physically being able to see your coworker is having a hard time, and extending a helping hand. You don't put any effort into these things, they just happen.

‘Remote work is not for everyone because it’s a lot more effort to communicate openly. In remote, silence is the default. Nothing happens or has any nuance unless you put yourself forward and engage. Putting effort into crafting a safe space for open, remote communication that feels natural is important for leaders to put serious thought into.

‘Explicitly passing the baton comes to mind, too. If something needs a review, tag someone; anyone. It is okay if they don't have domain knowledge, as they will pass it to someone else if necessary. What doesn't work is putting things into the void and hoping someone will notice.’

2. Tech stack and AI tools

Let’s take a look at the technologies these two search startups use.

Perplexity

Frontend and backend

TypeScript as the preferred programming language

React as the frontend framework

Next.JS for server-side rendering as the full-stack framework powering the front and backends

Modern JavaScript stack: using tools like eslint (static code analyzer), turborepo (incremental bundler and build system for JavaScript and TypeScript), prettier (an opinionated code formatter)

AI and machine learning (ML):

AI models: continuously keeping up with the latest foundational models. Perplexity evaluates them and uses ones that make the most sense.

Python and PyTorch (a deep-learning framework in Python)

Senior Frontend Engineer Oleg Akbarov said Perplexity moved from NextJS Pages Router to App Router, for a snappier web experience:

‘The Perplexity web experience was initially built as a NextJS web application using:

Pages Router for routing

Redux for state management

‘It was a great prototype and allowed a fast time to first byte (TTFB) experience for server-rendered routes. It also helped us ship a moderately dynamic experience within the pages.

‘However, about a year ago, we wanted to improve the navigation within the application and streamline the query submission experience. It was at this point that the performance limitations of a Pages Router became apparent. It was right at this time that the NextJS team introduced a new router version called App Router. App Router was a great fit for our use case: it allowed us to build rich, dynamic user interfaces without sacrificing the performance benefits of server-side rendering (SSR).

‘We needed to not slow down our shipping speed, while moving routing frameworks. Here is how we phased our migration:

Replace Redux with react-query. We did this because react-query has superior support for server rendering. It also has useful out-of-the-box features like optimistic updates and query revalidation (refreshing or updating cached queries to ensure they are up to date).

Gradually updating route logic. We carefully transitioned all the APIs from Pages Router to the new version, App Router.

Move data fetching logic. We moved data fetching logic to React Server Components (RSC).

‘Completing these stages took a few months. At the same time, we kept shipping new features even as the migration was ongoing. Best of all, the migration was invisible for end users.’
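As a rough illustration of the first step above, here is what moving a fetch from Redux into react-query can look like. This is a hypothetical sketch, not Perplexity’s code: the /api/threads endpoint, the Thread type, and the useThread hook are invented for the example, and only the @tanstack/react-query API is real.

```typescript
// Hypothetical example of the "Redux -> react-query" step: a thread fetch
// expressed as a useQuery hook. The endpoint and types are made up.
import { useQuery } from "@tanstack/react-query";

interface Thread {
  id: string;
  title: string;
  answer: string;
}

async function fetchThread(threadId: string): Promise<Thread> {
  const res = await fetch(`/api/threads/${threadId}`);
  if (!res.ok) throw new Error(`Failed to load thread ${threadId}`);
  return res.json();
}

export function useThread(threadId: string) {
  // react-query handles caching, revalidation, and server-rendered hydration,
  // which previously required hand-written Redux actions and reducers.
  return useQuery({
    queryKey: ["thread", threadId],
    queryFn: () => fetchThread(threadId),
    staleTime: 30_000, // consider the thread fresh for 30 seconds before refetching
  });
}
```

Because react-query owns caching and revalidation, the hand-written Redux actions, reducers, and selectors for that data can simply be deleted, which is what makes this kind of migration incremental.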

A lot of experimentation with AI tools: The Perplexity team is encouraged to experiment with AI tools. The most popular with engineers are:

GitHub Copilot and Cursor as IDEs. Most engineers inside the company use both – which was somewhat surprising to hear, as I’d assume engineers would settle on a favorite! Both have upsides, like Cursor’s Composer, and GitHub’s smooth integration with Visual Studio Code.

Internal tools to modify Perplexity’s custom LLMs. Perplexity has its own, internal large language models (LLM). This keeps being tweaked by the team, which requires engineering knowledge to do. The engineering team is also building tools so that later on, product managers can also make modifications.

Kagi

Here’s the tech stack behind Kagi:

Backend:

Crystal: the programming language used for the search binary. Crystal borrows a lot from Ruby (syntax/semantics), and Go (concurrency). The search binary contains all top level application logic and rendering.

Custom backend: The backend framework is largely custom; bootstrapped on some primitives from the standard library. Kagi uses a simple, novel message-passing system scripted using a visual diagramming tool.

Postgres: database of choice. For DB ops, the team uses Crystal's standard database driver for Postgres, and writes queries and migrations in raw SQL. The driver library provides a set of utilities for deserializing rows into types and connection pooling, which makes it very nice to use. Recently, the primary database instance was upgraded from 1 vCPU with 4 GB of RAM to something larger.

Frontend:

Server-side rendering using Jinja templates + JavaScript. The team uses an interpreter ported to Crystal for the Jinja templates.

As little JavaScript as possible. On principle, the Kagi team tries to use as little JavaScript as possible on the frontend, and to only use it to enhance the experience, not create it. They want to ensure the core search experience is usable without JS enabled at all. Ruthlessly trying to keep things simple is how they avoided having any serious use for Node in the build step, until very recently.

No client-side telemetry. This means no click tracking and no analytics. Admittedly, this creates challenges for both development and product. Still, the team believes it aligns with their values.

Zac shares the story of Kagi’s current backend, which is the second-generation:

‘After launching Kagi, we realized we needed to manage a ton of concurrent tasks, combined with even more checks and decision making. We needed to stay on top of a robust concurrent control flow, and not go crazy trying to debug it all. This was the motivator for the "second generation" backend.

‘Flow Based Programming (FBP) is the hidden gem we followed to build observable and complex concurrent systems. FBP is a methodology introduced in the 1970s, and we use a modern variant that takes the simple ingredients:

Simple "black box" interface. One message in → multiple messages out

FIFO queues. First in, first out

Routing table. This describes how messages are transferred between components

‘These three components produce a system so "regular" that you can describe control flow with any off-the-shelf diagramming tool. Put simply, we could create visualizations of our system representing the full source of truth; we just needed to write an interpreter for it!

‘Compare this to the classic scenario of how understanding of systems becomes muddled:

Someone makes a nice diagram to help others understand a system’s inner workings.

From the start, this diagram is only going to represent someone's abstract idea of a system. This is a small enough problem early on, which most people gloss over.

As time goes on, the system changes but nobody updates the documentation, so it further drifts from truth.

Over time, the diagram no longer represents the system.

‘What we have done is establish a "domain specific language (DSL) for software architecture". As engineers, we can specify and communicate using the terms of these diagrams. We do not have to focus on small details that cause us to “not see the forest for the trees.” We can agree on the shape of a network, and leave other teammates to implement it, with fundamental guarantees of outcome, regardless of the “in the weeds” decisions they make.

‘The boundaries between messaging components become natural testing boundaries, without having to write code in a "testable" way. They lend themselves heavily to code reuse. The message passing history during a request gives us free observability in the form of tracing, with no explicit effort from a developer. The whole system works very well for us.’
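To make the three ingredients concrete, here is a toy sketch of an FBP-style network. Kagi’s real system is written in Crystal and generated from diagrams, so everything below (the Message and Component types, the run loop, the example pipeline) is an invented illustration of the pattern rather than their code.

```typescript
// Toy illustration of the three FBP ingredients listed above; all names are made up.

type Message = { kind: string; payload: unknown };

// 1. "Black box" interface: one message in -> zero or more messages out.
type Component = (msg: Message) => Message[];

// 2. FIFO queues, one per component (plain arrays used as queues here).
// 3. Routing table: which component's outputs feed which other components.
interface Network {
  components: Record<string, Component>;
  routes: Record<string, string[]>; // component name -> downstream component names
}

function run(network: Network, entry: string, input: Message): void {
  const queues: Record<string, Message[]> = {};
  for (const name of Object.keys(network.components)) queues[name] = [];
  queues[entry].push(input);

  let active = true;
  while (active) {
    active = false;
    for (const [name, component] of Object.entries(network.components)) {
      const msg = queues[name].shift();
      if (!msg) continue;
      active = true;
      for (const out of component(msg)) {
        // The routing table, not the component, decides where outputs go,
        // which is what keeps the diagram and the running system in sync.
        for (const target of network.routes[name] ?? []) {
          queues[target].push(out);
        }
      }
    }
  }
}

// Usage: a two-component pipeline (normalize a query, then log it).
run(
  {
    components: {
      parse: (m) => [{ kind: "parsed", payload: String(m.payload).trim() }],
      log: (m) => { console.log("received:", m.payload); return []; },
    },
    routes: { parse: ["log"], log: [] },
  },
  "parse",
  { kind: "query", payload: "  kagi search  " },
);
```

The point to notice is that components never address each other directly; only the routing table decides where messages flow, which is what makes the whole network diagrammable and traceable.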

3. How does Kagi index?

One of the most important parts of a search engine is its search index. Zac breaks down the process at Kagi:

‘We currently have about half a dozen different search indexes:

Teclis: index of the non-commercial web

Tinygem: index for trending news

Small Web: index of personal blogs and websites

… and several more

‘A search index consists of three components:

Crawling

Indexing

Ranking

‘Each one comes with separate challenges:

‘Crawling the web is very different today to 20 years ago. Websites – especially big sites – are much more hostile to crawlers; probably for good reason! Luckily for us, we are not trying to crawl the entire web. Also, most websites we crawl are cooperative.

‘We do not index sites with too many ads. We crawl using a headless browser and with uBlock Origin extension installed, and count the number of ads and trackers on a page. If the figure is too high, we simply kick the page out of our index!

‘Indexing lets us do unique things most search engines don’t. The nature of our indexes is such that websites which get indexed tend to be high quality. We’ve noticed the number of ads and trackers on the page inversely correlates with the quality of content.

‘We surface the pages we consider high quality, pretty high up in our search results, which gives Kagi results a unique flavor. It also gives personal websites, blogs, and forums exposure in search, which they rarely get in other search engines.

‘Indexing and ranking utilize the newest machine learning techniques. For example, we use text embeddings, which allows us to search not just by keywords, but also by semantic meaning.

‘Kagi's indexes are focused on a very tiny portion of the web which allows us to be fast and cost effective. We hope this part of the web (not yet ruined by advertising) will grow in the future.’
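A conceptual sketch of those last two ideas, filtering out ad-heavy pages and ranking by embedding similarity, might look like the following. The threshold, the CrawledPage shape, and the rank function are made-up placeholders; Kagi’s actual crawler and ranker are not public.

```typescript
// Conceptual sketch only: drop pages with too many ads/trackers, then rank
// the rest by semantic similarity of embeddings. All values are placeholders.

interface CrawledPage {
  url: string;
  adAndTrackerCount: number; // counted by a headless browser with a blocker extension
  embedding: number[];       // text embedding of the page content
}

const MAX_ADS_AND_TRACKERS = 10; // placeholder threshold

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
}

// Keep only low-ad pages, then order them by how close their embedding is to
// the query embedding (semantic match rather than keyword match).
function rank(pages: CrawledPage[], queryEmbedding: number[]): CrawledPage[] {
  return pages
    .filter((p) => p.adAndTrackerCount <= MAX_ADS_AND_TRACKERS)
    .sort(
      (a, b) =>
        cosineSimilarity(b.embedding, queryEmbedding) -
        cosineSimilarity(a.embedding, queryEmbedding),
    );
}
```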

4. How is Perplexity so fast?

One thing that stands out about Perplexity is that it’s very fast, despite being an AI search engine that needs to generate a lot of tokens (text) in realtime. How is it so rapid? Alex Lang explains:

‘Speed is a major focus that we optimize for because it has two major benefits:

Better UX. Quick responses dramatically improve user experience.

Lower cost. The faster an AI model produces an answer, the less computing resources it consumes. Having fast models actually reduces our infrastructure cost in serving answers!

‘Two things are worth highlighting:

We run multiple models for every query. Few people realize, but under the hood we are running multiple models to provide the final answer. We can do this because we’ve optimized to make these custom models as small as possible. We keep looking for ways to distill more knowledge into a smaller LLM. We also don’t shy away from using more traditional machine learning techniques if they give better results.

Building our custom scheduling and runtime engine. We are building this so we can have additional knobs to tune that aren’t available from third-party systems.

‘Both these approaches should unlock further speed gains; so expect Perplexity to speed up even more in the near future.’

5. Engineering practices

Perplexity

Perplexity prefers lightweight processes and biases towards action, which means:

No formal design review process like design docs, RFCs, or ADRs. Most work starts with a quick Slack discussion, or a one-page document. The only exceptions are large-scale projects, which are still kicked off with a more traditional design process.

Code reviews: all PRs are reviewed before merging.

Staged rollouts: the norm across the engineering team.

Linear: used for task tracking. We published a deepdive on Linear’s engineering culture.

AI evaluations are defined upfront. This is an approach specific to working with LLMs. It’s roughly equivalent to a design review for non-AI engineering work. Engineering manager Alex Lang explains:

‘Detailed planning on the model side is a waste of time. This is because models and techniques are always changing, so any design needs to be agile and ready to switch at a moment's notice.

AI evaluations, however, are crucial to the final success of the product. Changing evaluations late in the process is extremely painful. Once we select the right evaluation, we keep this stable. We spend a lot of time and effort to go over details of the evaluation we are defining. By investing in this work upfront, we are able to guarantee the AI project will ultimately be a success.”

How does Perplexity hire? Alex Lang details the process:

‘Product sense is important for all our hires to have; we place great emphasis on this. Thanks to all engineers possessing it, the company has fewer engineering managers and product managers than many similar-sized companies. All product and engineering managers we hire are very technical, and strongly committed to empowering ICs.

We used to have trial periods, but no longer do. Early on, people were hired via trial periods. Over time, we did not find this approach scalable. We now have a fairly typical interview process.

We do have an extremely high bar. When choosing between hiring the wrong person or not hiring at all, we’d much rather not hire! For new joiners, we emphasize the importance of hitting the ground running. We want to see a positive impact in the first days, weeks, and months.’

Kagi

Here’s the engineering process at Kagi, as explained by Zac Nowicki:

‘We are ruthless in solving "papercut" pain points for our users on a weekly basis. We win so much favor with users when they report something, then wake up the next day and it’s fixed. It brings users delight, they tell their friends, and it usually doesn't take long for us to do. Everyone wins!

‘Managing larger developments has been much more of a challenge. Getting everything flowing smoothly from specification, design, and implementation is incredibly hard, with the odds against us; such as being a fully remote team spread across every major timezone.

‘We adopt a "marble-chiseling" approach of shaping things with targeted hits, like a sculptor uses their chisel. This is how it works:

Devs start out with a high-level spec, and own some large slice of the work

Devs largely self-manage their work

Reporting is done via two or three syncs throughout the week to keep successes and roadblocks as visible as possible

‘Our approach to getting stuff done is to do the most basic thing, then iterate to the next level of depth. Then repeat.

‘Process-wise, we use:

"Checkpoint" standups: a few throughout the week, to keep things grounded.

"Workspace" voice channels. Throughout the day, we have voice channels people can sit in for virtual "coworking." Sometimes this leads to collaboration, other times it's quiet. Still, we’ve found that even when working in silence, the "body doubling" effect – knowing that others are in the same channel – can be quite helpful for staying focused.

‘Trunk-based development: we rapidly integrate into a single main branch as frequently as possible. This comes with a few more approaches:

Aggressively feature flag WIP things

Refuse to work with long-lived development branches.

‘We aim to keep everyone on the same page, away from the "Git paperwork" of having to deal with complex merges.

‘As an added benefit, if you break something on the backend which goes under the radar, it’s not uncommon for someone else like a frontend developer to spot the issue. This is a lot better than what happens with long-lived branches. We think that changes stagnate by being siloed in branches until they’re "perfect", which is more harmful.

‘Linear: we currently use Linear for our overarching task & project management.’

6. Product milestones

Some notable milestones for Perplexity, and Kagi, to date:

Perplexity

Building the search index (started late 2023; ongoing)

Opening up the Perplexity API (October 2023)

Fine-tuning their own LLM models (November 2023: first launch of PPLX models)

Launching the discover feed (early 2024)

Launching Shop Like a Pro (November 2024)

Perplexity followed a smart strategy. Alex Lang elaborates:

‘Early on, it was easy to dismiss Perplexity as “just a wrapper.” And there was truth in this categorization, back then; the initial launch used third-party search results and AI models. However, people underestimated the value of this strategy.

‘As a product-led company, we wanted to get in front of real users and iterate rapidly. The fastest way to do that was to use third-party technologies. Once we understood our users better, we got to work on building:

Our own search index that matches our needs

Our own AI models that work according to our use cases

‘This “get in front of real users as soon as possible” approach remains ingrained in our engineering culture to this day. Once we’ve learned enough, we build the enduring moat that will allow us to scale.”

‘Once we had our own search index and models, we created an API to share our technology with others. The initial goal was primarily to speed up some internal work, but we knew that the same things that helped us internally, would help other developers. Since then, it's been fun to see how partners and independent developers have put our API to use.

‘Finally, a key milestone has been launching complementary products to the original Answer Engine; honestly, these could be startups in their own right! Our users are curious, and the Discover feed has been a delightful way to satisfy that curiosity. While the Discover feed could stand on its own as a product, its true value has been the growth it has powered for the core Answer Engine. Our recent “Shop Like a Pro” feature, which allows users to go from question to buying a product in one click, is a large enough endeavour to be its own product. But this is just the first step in our evolution from an Answer Engine to an Action Engine.’

Kagi

Major product milestones:

Building Kagi Search (2022)

Launching The Assistant: a baseline LLM chat interface (2024)

Lots of small projects shipped on the side: like FastGPT, Universal Summarizer, the Small Web index, and the brand new Kagi Translate (2023-2024)

What’s next: integration of all these things, and looking for ways to have them talk to each other in ways that make sense and feel natural.

Zac has been part of the core team since the early days, and he reflected on the journey so far. It’s really nice to hear a first-hand account of a successful product launch:

‘I am unsure anything will ever top the magic of our launch day in 2022. The months and weeks leading up to that were learning how Stripe works, designing our database schemas to hopefully last, and doing our damnedest to make sure it didn’t immediately fall flat when the clock struck midnight.

‘It's all simple strokes, but done under pressure, like:

A cloud function to catch Stripe events

Storing as little information as possible in our tables

Fallback logic so that if the Stripe event handler ever fails, we wouldn't go bankrupt thanks to incorrectly billing or refunding users

Design it with multi-user billing accounts in mind, which was only an idea at the time

‘It's not perfect and has its quirks, but it has largely been unchanged and has carried us to where we are today at ~36,000 paying customers.

‘There were fewer than five of us, each owning a slice in full depth. A formative moment for our core team (myself included), was when we taught ourselves what we could do with so little, considering the giants around us. Taking minimal dependencies, knowing exactly every risk we were taking.

It was totally unforgettable. I hope we can pull it off again.’
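For readers who have not wired up billing before, the “cloud function to catch Stripe events” Zac mentions generally has the shape below. This is a hypothetical Node/TypeScript sketch using the official stripe package; the recordSubscriptionState helper and the event types handled are assumptions, not Kagi’s implementation (which is built around storing as little as possible).

```typescript
// Hypothetical shape of a webhook handler that catches Stripe events.
// The storage helper below is a made-up placeholder.
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
const endpointSecret = process.env.STRIPE_WEBHOOK_SECRET!;

// Placeholder: persist only the minimum needed to map a Stripe customer
// to an account's subscription state.
async function recordSubscriptionState(customerId: string, status: string): Promise<void> {
  // e.g. UPDATE accounts SET stripe_status = $1 WHERE stripe_customer_id = $2
}

// Generic cloud-function signature: raw request body plus the Stripe signature header.
export async function handleStripeEvent(rawBody: Buffer, signature: string): Promise<number> {
  let event: Stripe.Event;
  try {
    // Verifying the signature ensures the event really came from Stripe.
    event = stripe.webhooks.constructEvent(rawBody, signature, endpointSecret);
  } catch {
    return 400; // reject anything that does not verify
  }

  if (
    event.type === "customer.subscription.updated" ||
    event.type === "customer.subscription.deleted"
  ) {
    const subscription = event.data.object as Stripe.Subscription;
    const customerId =
      typeof subscription.customer === "string"
        ? subscription.customer
        : subscription.customer.id;
    await recordSubscriptionState(customerId, subscription.status);
  }

  // Returning a non-2xx status makes Stripe retry the delivery later, which
  // covers part of the fallback behaviour described above.
  return 200;
}
```

Signature verification plus Stripe’s retry-on-failure delivery covers a lot of the fallback logic mentioned; anything beyond that, such as periodic reconciliation against the Stripe API, is extra safety.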

7. Unique challenges of search engine engineering

What are the biggest engineering challenges in building a search engine, compared to other systems the team has worked on?

Perplexity

Alex says:

‘The biggest challenge in building an Answer and Action Engine is ensuring we provide a futuristic experience, in a way that wows users, while covering all the basics a user expects from a search engine. In other words, it’s about figuring out the right ways to change up user behavior.

In the 1990s, the search experience was rapidly changing and users had not formed habits, but since then the search engine experience has been remarkably stable. So we constantly have to find a way to delight users with a surprising new experience that leverages recent technologies, while also rounding out the full product experience to cover all the tiny habits people have developed and expect to be present, in order to be satisfied with a search experience.’

Kagi

Zac reflects:

“The major difference for me is that this project connects with people like nothing else I've ever worked on.

It involves caring about details and edge cases in ways – and sometimes on such a small scale – that you wouldn't ever think of, usually. Using the publicly available APIs of other search engines, you could make something that walks and talks like Kagi; but what no one else has is the meticulous shaping of countless small rules and conditionals that keep the cruft (poorly designed, overly complicated, unwanted code) out of our results.

There's no "big secret" to what we do; we've simply put the hours into this marathon that no one else has, while staying true to our mission and value proposition to our customers, and no one else.”

Takeaways

These two startups are both in search, but are so different from each other:

Perplexity is VC-funded, growing rapidly in usage, hiring more aggressively, and launching new products at rapid speed.

Kagi: no VC funding, offering only a paid service, growing their team slowly, and being thoughtful about launching new features.

These are the most interesting points I learned about each startup:

Perplexity: hire for product sense, and prioritize in-person work. Perplexity’s engineering team emphasizes how important “product sense” is: it’s something they explicitly look for during the hiring process.

Kagi: don’t be afraid to build custom systems, or use lesser-known tools. Kagi is the first startup I’ve heard of which uses the Crystal programming language. The engineering team is unafraid to build novel systems if they decide doing so helps them get things done: this is how they built the backend, based on the Flow Based Programming methodology.

Both startups give broad ownership to engineers. Perplexity and Kagi each share the characteristic of giving engineers broad ownership and autonomy. I get the sense there is no heavy product layer at either company; engineers are encouraged to understand customers, the product, and come up with product ideas which they then build.

Both companies know they need to build a more pleasant, better search product than Google Search offers: it’s not easy, but the rewards for succeeding could be spectacular.

Each startup is building a challenger product to Google Search at a time when the Big Tech giant looks vulnerable. Will Google Search evolve to compete more with the likes of Perplexity, by becoming an “answer engine” while cannibalizing its ads revenue? Will it put customers first, and offer a paid product without ads? Or could it find itself “stuck” by being addicted to the hundreds of billions in ads revenue, thereby allowing competitors to overtake it?

Competition is great for the tech ecosystem, and challengers to Google in the search market are much needed. I hope you enjoyed learning about how these two startups operate. Best of luck to the Perplexity and Kagi teams!

As a paid subscriber, if you’d like to try out their paid products — without needing to enter your payment details — you can do so here.

If you enjoyed this deepdive: you might appreciate deepdives about Meta, Amazon, Stripe, OpenAI, Figma and several other tech companies.

Monday, 27. January 2025

The Pragmatic Engineer

Get free access to Perplexity Pro and Kagi Ultimate

New perk for paid The Pragmatic Engineer subscribers

Hi, this is Gergely with a bonus issue of the Pragmatic Engineer Newsletter — with a special announcement. Note that this is a personal recommendation — and a deal I initiated with two tools I’ve been using as a paid customer to replace my Google Search usage the last year. It is not a paid advert: I have not received any monetary or other compensation for sharing this offer. I have no affiliation with either of the companies mentioned below. See more in my ethics statement. It’s the first-ever such announcement since the start of The Pragmatic Engineer three years ago — and I do not plan to do these frequently.

I’m excited to share a unique perk for all paying subscribers to The Pragmatic Engineer. Starting today, you now get access to:

Perplexity Pro for 12 months (worth $200)

Kagi Ultimate for 3 months (worth $75)

You don’t need to share credit card details or payment information to access this deal.

Get Perplexity and Kagi codes for free

This deal only applies to new Perplexity and Kagi customers. Cancelling a newsletter subscription deactivates the code supplied.

If you’re not yet a paid subscriber to The Pragmatic Engineer, you can upgrade here.

How did these offers come about?

Some background: for about a year, I’ve been using Kagi Professional as my default search engine, after deciding to drop Google Search. Meanwhile, for deep research and highly nuanced web search, my choice is Perplexity Pro.

I want to mention that I haven’t been paid to mention these startups – in fact, I reached out to them! I’m actually a long-term user of both products, and want to give subscribers the opportunity to discover them, too. So, I approached the companies about offering trials.

I’m pleased to say that Kagi and Perplexity were each open to offering multi-month free trials to subscribers of The Pragmatic Engineer. For Kagi, it’s the first time they’ve made such an offer via a third party.

Let’s tackle the elephant in the room: why would you pay for search? I certainly never thought I would. But the reason I do so today, is that I’m tired of being “the product” for search engines. Every time I execute a search, I have to scroll through sponsored results that are mixed into organic ones. I also have a nagging sense that search engine results are degrading over time. The likes of Google care more about maximizing ad revenue than they care about serving up high-quality, helpful search results.

Finally, as a tech professional and someone who believes in the benefits of competitive markets, it’s refreshing to see startups challenge the monopolistic Big Tech dominance of search with new, alternative approaches. Perplexity is barely two years old, and already a challenger to Google. And Kagi offers unusual transparency behind their exact usage numbers and customer numbers by sharing realtime platform stats.

Perplexity has built a product that feels like a new type of search; one which delivers research to your fingertips via a search engine, complete with references (links) about where the information has been sourced from. This lets you double check how legitimate the sources are, and so using such a tool can strengthen the credibility of your work.

Meanwhile, Kagi is building a privacy-first, user-first version of Google search without VC funding or ads. The results it serves feel much more like “hits”, than the “misses” I’ve come to expect from Google. Check out the Kagi team’s view on the real cost of “free search.”

What you get from Perplexity Pro and Kagi Ultimate

What you get with Perplexity Pro:

Pro searches. A Pro search conducts thorough research to provide in-depth, accurate responses to questions. It’s particularly useful for summarising, data analysis, debugging, and content generation, making it the perfect tool for developers and data scientists. More about Pro search features.

Powerful AI models. Search using a choice of advanced AI models, each with its own unique strengths. Choose from GPT-4 Omni (advanced reasoning), Claude 3 Sonnet and Haiku (natural-sounding responses and file uploads), Sonar Large 32k (conciseness and accuracy). You can even use the new DeepSeek R1 reasoning model (hosted with Perplexity) and OpenAI o1 reasoning model.

Document Analysis: Upload text and data files to search the web and also internal knowledge bases.

What you get with Kagi Ultimate:

Unlimited Kagi searches. Lightning-fast, no ads and no tracking, and the ability to block domains. Popular domains users block include Pinterest, Daily Mail, and Quora.

The Assistant access. The Assistant by Kagi combines top large language models (OpenAI, Anthropic, Meta, Google and Mistral) with optional results from Kagi Search, making it the perfect companion for creative, research, and programming tasks. More details about The Assistant.

Get Perplexity Pro and Kagi Ultimate

Next up: deepdive on both startups

Tomorrow, we publish a deepdive on the engineering cultures of Perplexity and Kagi. Both these startups are building alternatives to Google Search, and each has its own distinct approach and engineering culture. Stay tuned!


@_Nat Zone

The Privacy by Design Conference is on Tuesday, January 28

What is the Privacy by Design Conference?

The Privacy by Design Conference is an international conference held around Data Privacy Day (January 28).

People from a wide range of privacy-related backgrounds, including culture, law, technology, business, and operations, take part in dialogue from diverse perspectives.

This year it takes place on January 28 at the Hibiya Kokusai Building Conference Square. I will be moderating one of the sessions.

14:35〜15:20

What kind of society can we realize with digital ID? (Room 8C; in English, with Japanese subtitles)
Session: “Secure and Safety Service Design ~ Designing with Trusted Digital ID ~”

In this session, the moderator and panelists will discuss topics related to cross-border digital ID projects. Based on current use cases, we will talk about security and safety challenges grounded in trusted digital ID schemes.

Reference Link: DNP and MUFG Aiming to Commercialize Digital IDs

Moderator: Nat Sakimura (Chairman, MyData Japan). Panelists: Rintaro Okamoto (member, Japan-Australia Cross-Border Interoperability Working Group) and Katryna Dow (CEO and Founder, Meeco).

Conference details

Details of the conference are available on the official site.

See you there!

Saturday, 25. January 2025

Doc Searls Weblog

What Works After a Disaster Happens?


When Hurricane Helene hit Western North Carolina, the Swannanoa River rose three stories above its shores, all but erasing the town named after the river, and leaving hundreds homeless.

But the challenge for Swannanoa was not just recovery. It was regeneration. For that, Swannanoa’s residents formed new kinds of partnerships and problem-solving approaches that could prove a model of what to do after disasters hit communities anywhere—such as Altadena and Pacific Palisades, on which I’ve been reporting lately in my News Commons series.

Kevin Jones, one of the founders of SOCAP and curator of Neighborhood Economics, is a Swannanoa resident reporting on what is working in real time. This next Wednesday, January 29 at Noon EST, Kevin will be doing exactly that in our Salon Series at Indiana University—and live on the Web.

If you care about recovery and regeneration for your neighborhood—or anybody’s—after disaster strikes, this is a one-hour required course. As a salon, it is participatory. Come with questions, answers, or helpful stories of your own. Let’s teach and learn together, for the sake of a world larger than each of our own.

Go here for more information and the Zoom link for the salon.

Photo by Bill McMannis.

Thursday, 23. January 2025

The Pragmatic Engineer

The Pulse #121: Netflix’s remarkable turnaround

In two years, the streaming giant’s valuation increased by 4x. Also: learnings from a managed database service losing customer data, and more.

The Pulse is a series covering insights, patterns, and trends within Big Tech and startups. Notice an interesting event or trend? Send me a message.

Today, we cover:

Industry pulse. TikTok’s gamble pays off, Stripe and Meta do 4-5% layoffs, Reflections on a month of using Devin, Hundreds of billions in AI infra investments, Amazon’s AI models fail to impress, Google allows malware Homebrew install links as sponsored Search results, and more.

Netflix’s remarkable turnaround. In two years, Netflix went from growth stalling – and the company’s valuation in freefall – to all-time-high user numbers and valuation. The standout stock price growth means that Netflix employees who opted to take a part of their compensation as stock options most certainly did better than those opting for all-cash packages.

Managed database service loses customer data. Database-as-a-service Turso lost customer data thanks to no backups running on beta service offerings. There are learnings not just for the startup – launching a service without backups running doesn’t feel like a responsible move – but also for all engineering teams. Don’t assume your vendor does backups: validate that they do.

1. Industry Pulse

TikTok’s gamble pays off?

Read more


Doc Searls Weblog

And Now the #HughesFire


Twenty-fourth in the News Commons series
Eighth on the #LAfires

7:35 am January 23, 2025—It’s morning now. The Hughes Fire is 17% contained but no longer interesting. The Sepulveda Fire broke out last night along the 405 freeway. It stopped at forty acres, and doesn’t matter much now. Here’s the path of one fire helicopter shuttling between the two fire sites:

So reporting is now off the fires themselves. Let’s look at these six screen grabs—

—and put them where they go on this table:

The rows are requirements of story-telling. The columns are stages of disaster news coverage.

In these screen grabs—

Channel 4 is doing a mix of live reporting and human interest, focusing on how some characters (school kids) are doing nice things for firefighters.

Channel 5 is doing live reporting from the Sepulveda Fire, which is pretty much out.

Channel 7 has a weather forecast, for rain, which is long overdue and will put a damper on the fires. (An aside: my wife observes that “weather people on TV are either crusty old dudes or hot young women.”)

KCAL/9 has an interview with a fire official working toward coming to an understanding, while earlier footage (recalling live coverage) shows to his right.

KTTV/11 (the Fox channel) is showing some public service stuff mixed with promotional jive. It’s between news segments and regular programming, so not news at all.

Telemundo (Channel 52) is reporting on a church recovering from the Eaton Fire. That’s a mix of live reporting and human interest.

Blame is not in any of these. If you want some of that, here’s the latest from the LA Times.

What I’m trying to do with this table is show how the requirements of a story (character, problem, movement) are met by the four stages of story-telling by journalists. Every story you see, hear, or read will play across that grid. Of course, none matter more than #4, which is more complicated and less interesting than the other three stages. But, that is where most facts will go. For more on that, see How Facts Matter.

8:10 pm—The LA Times reports that the I-5 is open again. (That link goes to their ongoing coverage.) Here is what CAL FIRE says about current fires and their containment:

Note that the Palisades and Eaton Fires are still not fully contained, though we haven’t been worried about them in a while. The same will happen with the Hughes Fire. While it is currently 0% contained, that does not mean it hasn’t been slowed or halted along stretches of its perimeter. Full containment means firefighters are sure the fire will not spread beyond control lines. And they are very cautious about the numbers they share. The winds later tonight and tomorrow are still a concern, and thousands of people remain evacuated.

5:40 pm—I just heard on one of the TV channels that The I-Five, aka The Grapevine, is likely to be opened again soon. All the channels sound more optimistic about the paths of the fire. Smoke is also flowing upwards rather than sideways, so that’s less of a concern as well.

Night has also fallen, meaning the flames are far more dramatic:

Great TV, but not more to worry about.

3:45 pm—Watching the Los Angeles TV stations (each with a tab in my browser) cover the Hughes Fire. I have them arranged by channel number. (Note that NBC’s 4 and Telemundo’s 52 are co-teamed. One reporter’s mike says NBC4 on two sides and T—for Telemundo—52 on two sides.) Including the above, here are all the media we’re watching, reading, and listening to:

KNBC/4
KTLA/5
KABC/7
KCAL/9
KTTV/11
KVEA/52
LA Times
LAist
#HughesFire on Bluesky
#HughesFire on Facebook
#hughesfire on Instagram
#HughesFire on Threads
#HughesFire on X

I’m also watching aerial fire fighting (aircraft, paths) on adsbexchange.com and FlightRadar24. Linked images from each:

On the plus side, the concerns here are for outer suburbs and exurbs on a landscape of hilly grassland and chaparral. Compared to the steeper San Gabriels (such as those above Pasadena and Altadena), the civilized stuff is relatively defensible, with fire breaks on the ridges and bare spaces around at least some of the housing developments and other collections of structures. Winds are not too strong for aircraft, which have been able to drop water and lines of retardant. It helps that Castaic Lake itself is an enormous fire break, as well as a convenient source of water to scoop up and drop on the fire.

So, while there is lots of danger here, especially if the winds get too high for aircraft to keep flying, I expect the fight against the Hughes Fire to succeed.

If we can back off the edges of our seats a bit, it should be interesting to look at coverage of the #HughesFire through the prism of story. Again, the three requirements are:

Character(s)
Problem(s)
Movement

And the four stages of coverage are:

Live reporting
Human interest
Blame
Coming to an understanding

At this stage, the main character is the Hughes Fire, which is also the problem. And it’s moving. Of course, all the media like to bring the human characters in: a resident, a fire captain, a county official, a store owner. Not long ago one of the TV stations interviewed an insurance guy holding a book full of checks he was ready to write to policyholders. That’s also what we get with the mix of the first and second kinds of coverage. No blame yet, but (as they say in social media reels) wait for it. Understanding, of course, will take the longest. Here’s a screen grab from Google Maps that will tell at least some of what isn’t much of a story yet (not much character or movement), but there is an expression for what you see: the Wildland-Urban Interface (WUI). Read about it at that link. See it here, in the path of the Hughes Fire if it isn’t stopped:

And here is a wider view of the whole region:

With the exception of the Chiquita Canyon Landfill, all of the bare areas are housing developments. All of the housing and industrial developments used to be bare areas. And all of both used to be desert mountain grasslands and chaparral. Throughout the natural history of the area, at least since the end of the Pleistocene, burns from natural causes cleared off the land about every twenty-five years or so. And all the mountains were produced by earthquakes. I’ll write more about all of this in my next post.

Wednesday, 22. January 2025

The Pragmatic Engineer

Observability: the present and future, with Charity Majors

In today's episode of The Pragmatic Engineer, I'm joined by Charity Majors, a well-known observability expert – as well as someone with strong and grounded opinions.
Stream the Latest Episode

Available now on YouTube, Apple and Spotify. See the episode transcript at the top of this page, and a summary at the bottom.

Brought to You By

Sonar — Trust your developers – verify your AI-generated code.

Vanta — Automate compliance and simplify security with Vanta.

In This Episode

In today's episode of The Pragmatic Engineer, I'm joined by Charity Majors, a well-known observability expert – as well as someone with strong and grounded opinions. Charity is the co-author of "Observability Engineering" and brings extensive experience as an operations and database engineer and an engineering manager. She is the cofounder and CTO of observability scaleup Honeycomb.

Our conversation explores the ever-changing world of observability, covering these topics:

• What is observability? Charity’s take

• What is “Observability 2.0?”

• Why Charity is a fan of platform teams

• Why DevOps is an overloaded term: and probably no longer relevant

• What is cardinality? And why does it impact the cost of observability so much?

• How OpenTelemetry solves for vendor lock-in

• Why Honeycomb wrote its own database

• Why having good observability should be a prerequisite to adding AI code or using AI agents

• And more!

Takeaways

My biggest takeaways from this episode:

1. The DevOps movement feels like it’s in its final days, having served its purpose. As Charity put it:

“It’s no longer considered a good thing to split up a dev team and an ops team to then collaborate, right? Increasingly, there are only engineers who write code and own their code in production. And I think this is really exciting. We can understand why Dev versus Ops evolved, but it was always kind of a crazy idea that half your engineers could build the software and the other half would understand and operate it.”

Indeed, I cannot name any team at startups or at large tech companies that has a dedicated Ops team. While there surely exist such companies in small pockets – think of more traditional companies operating in highly regulated environments like finance or healthcare – this setup feels like the exception rather than the norm.

2. Lots of people get dashboards wrong! Charity doesn’t think that static dashboards are helpful to engineering teams at all. In fact, misusing dashboards is one of the most common observability mistakes she sees:

“Unless your dashboard is dynamic and allows you to ask questions, I feel like it's a really poor view into your software. You want to be interacting with your data. If all you're doing is looking at static dashboards, I think it limits your ability to really develop a rich mental model of your software. And this means that there are things that you won’t see; because you did not graph it on your dashboard!”

3. Observability will be especially important for AI use cases in these ways:

o11y for LLMs: to get data on how they behave and to be able to debug behaviors. This is relevant for teams building and operating AI models.

o11y for code generated by AI: the generated code should have the right amount of observability in place. Once the code is deployed to production, developers need to be able to get a sense of how the code is behaving there!

GenAI means that a lot more code will be generated via LLMs – and all this code needs observability!

The Pragmatic Engineer deepdives relevant for this episode

How Uber Built its Observability Platform

Building an Observability Startup

How to debug large distributed systems

Shipping to production

Timestamps

(00:00) Intro

(04:20) Charity’s inspiration for writing Observability Engineering

(08:20) An overview of Scuba at Facebook

(09:16) A software engineer’s definition of observability

(13:15) Observability basics

(15:10) The three pillars model

(17:09) Observability 2.0 and the shift to unified storage

(22:50) Who owns observability and the advantage of platform teams

(25:05) Why DevOps is becoming unnecessary

(27:01) The difficulty of observability

(29:01) Why observability is so expensive

(30:49) An explanation of cardinality and its impact on cost

(34:26) How to manage cost with tools that use structured data

(38:35) The common worry of vendor lock-in

(40:01) An explanation of OpenTelemetry

(43:45) What developers get wrong about observability

(45:40) A case for using SLOs and how they help you avoid micromanagement

(48:25) Why Honeycomb had to write their database

(51:56) Companies who have thrived despite ignoring conventional wisdom

(53:35) Observability and AI

(59:20) Vendors vs. open source

(1:00:45) What metrics are good for

(1:02:31) RUM (Real User Monitoring)

(1:03:40) The challenges of mobile observability

(1:05:51) When to implement observability at your startup

(1:07:49) Rapid fire round

A summary of the conversation

For those of you more interested in reading a summary of the conversation — or skimming over it — see it here. Takeaways follow after the summary.

Observability (o11y) basics

Observability is about understanding software, specifically the intersection of code, systems, and users.

It is not just about errors, bugs and outages; it is also about understanding the impact of code.

Observability is a tool that is critical for development feedback loops, and is not just an operational tool.

The goal of good o11y is to help engineers understand their software in the language of the business.

Engineers should be able to tie their work back to top-level goals, and explain how their work translates to the business.

Sampling is an important lever, contrary to the idea that every log is sacred.

‘metrics’ vs ‘Metrics’

We need to distinguish between metrics (small 'm') as a generic term for telemetry and Metric (capital 'M') as a specific data type, a number with appended tags.

The Metric data type is limited because it doesn’t store any contextual relationship data.

The Three Pillars Model

The three pillars model of observability is this: metrics, logs and traces.

Many vendors sell products for each of these pillars – as well as for all of them.

The problem with the Three Pillars Model is that every request that enters a system is stored multiple times, in different tools (metrics, logs, traces, profiling, analytics).

There is little to connect the data points; engineers are left to manually correlate the data.

The cost of following this model is high, because the same data is stored in multiple tools and databases.

What is Observability 2.0?

Observability 2.0 moves away from multiple sources of truth to unified storage.

With unified storage, there are no dead ends: engineers can click on a log, turn it into a trace, visualize it over time, and derive metrics and SLOs from it. They can then see which events are violating SLOs.

Good observability powers developer feedback loops. It allows engineers to visualize the CI/CD as a trace and see where tests are breaking. The goal is to keep the time between building code and seeing it in production as small as possible.

Observability is shifting from being an ops tool, focused on errors and downtime to something that supports the entire development cycle.

Modern engineering practices + good observability is where the real value is.

Modern engineering practices such as feature flags, progressive deployment, and canary releases, along with observability, give engineers confidence to move quickly and safely.

Observability acts as a translation layer, enabling engineers to reason about their work and tie it back to top-level business goals.

The dream goal? To be able to explain and understand our work in the same language as everyone else: how much financial value is this piece of code generating?

Why is observability hard, anyway?

Engineers have to think about what they might need to understand in the future. Like during an incident at 2:00 AM!

Software is hard. Observability is the first line of defense.

Tools have historically required engineers to be masters of multiple disciplines, e.g., they had to convert their code into physical resources such as CPU and RAM usage.

Cost of Observability: why is it so expensive?

One reason observability is expensive is the multiplier effect: the same data is stored multiple times. This is a common criticism of the Three Pillars model.

Cardinality: another thing that can make it a lot more expensive

Cardinality refers to the number of unique items in a set. Unique IDs, such as request IDs, have the highest possible cardinality.

Big 'M' Metrics tools are designed to handle low-cardinality data (Observability 1.0 tools)

Adding high cardinality data to metrics tools makes them very expensive.

These days, world-class observability teams spend the majority of their time governing cardinality!

The more unique the data, the more valuable it is for debugging but that also means it costs more.

To solve this, the industry has to move away from tools backed by big 'M' metrics, to those using structured data where high cardinality can be stored.

The wider the logs (the more context attached to each event), the better the ability to identify outliers and correlate data.
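To make the cardinality point concrete, here is a minimal sketch (mine, not from the episode) that contrasts a low-cardinality metric with a wide, high-cardinality event, using .NET’s built-in System.Diagnostics APIs; the meter, source, and attribute names are illustrative assumptions.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

var meter = new Meter("Shop");
var source = new ActivitySource("Shop");

// Low cardinality: a counter tagged with a value drawn from a small, fixed set.
// Metrics backends handle this cheaply because few label combinations exist.
var checkouts = meter.CreateCounter<long>("checkout.count");
checkouts.Add(1, new KeyValuePair<string, object?>("region", "eu-west-1"));

// High cardinality: a wide event (span) carrying unique IDs and rich context.
// This is what gets expensive in a Metrics-backed tool, but it is exactly the
// data a column store built for structured events can handle and query.
using (var activity = source.StartActivity("checkout"))
{
    activity?.SetTag("user.id", Guid.NewGuid().ToString());  // unique per user
    activity?.SetTag("cart.id", Guid.NewGuid().ToString());  // unique per request
    activity?.SetTag("payment.provider", "stripe");
    activity?.SetTag("cart.total_cents", 12999);
}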

Is Observability 1.0 getting in the way of building what engineering needs – at a lower cost?

The data model of traditional observability tools does not fit the data that engineers actually need.

Metrics were optimized for a world where resources were very expensive, but now that storage and compute are cheaper, it’s possible to store more data and slice and dice it in real time.

A column-based data store is needed to use flexible structured data without having to define indexes and schemas in advance.

OpenTelemetry

What is OpenTelemetry (OTel)?

A collection of APIs, SDKs and tools to make telemetry portable and effective.

It provides a framework for consistent telemetry with consistent naming and semantic conventions, allowing vendors to do more with the data.

OTel overtook Kubernetes as the number one project in the CNCF.

The goal of OTel is to allow engineers to instrument code once, and then point the data to whatever vendor is chosen.

OTel forces vendors to compete on the basis of their excellence and responsiveness.

Using OpenTelemetry is a safe bet for companies to enable portability of data between vendors.

It also gives the option of negotiating with vendors, because of the ability to switch!
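As a rough sketch of what that portability looks like in .NET (my example, not from the episode, assuming the OpenTelemetry and OTLP exporter NuGet packages), the instrumentation stays the same and only the exporter configuration changes when switching vendors; the service name, source name, and endpoint below are placeholders.

using System;
using System.Diagnostics;
using OpenTelemetry;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Instrument once against the vendor-neutral API...
var source = new ActivitySource("Shop");

// ...and decide where the data goes in configuration. Switching vendors means
// pointing the OTLP exporter at a different endpoint, not re-instrumenting code.
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("shop-api"))
    .AddSource("Shop")
    .AddOtlpExporter(o => o.Endpoint = new Uri("http://localhost:4317"))
    .Build();

using (var activity = source.StartActivity("checkout"))
{
    activity?.SetTag("cart.items", 3);
}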

Common mistakes with observability

Introducing it too late. Engineers feel like they don't need observability until they are in production and things start breaking.

Using dashboards wrong.

Engineers can get too attached to dashboards.

Dashboards, unless they are dynamic and allow you to ask questions, are a poor view into software.

Not using SLOs and error budgets enough.

SLOs (Service Level Objectives) should be the entry point, not dashboards.

SLOs are the APIs for engineering teams.

SLOs provide a budget for teams to run chaos engineering experiments.

SLOs are a hedge against micromanagement, because when teams meet their SLOs, the way they spend their time is not important.

SLOs allow teams to negotiate for reliability work if they are not meeting their obligations.

SLOs need to be derived from the same data as debugging.
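As a small worked example (my own, not from the episode), an error budget can be computed directly from the same event counts used for debugging; the 99.9% target and the request counts below are assumed numbers.

using System;

// Remaining error budget for an assumed 99.9% availability SLO over a 30-day window.
double sloTarget = 0.999;          // assumed SLO target
long totalRequests = 10_000_000;   // events observed in the window
long failedRequests = 7_200;       // events that violated the SLO

double allowedFailures = (1 - sloTarget) * totalRequests;        // 10,000 allowed failures
double budgetRemaining = 1 - (failedRequests / allowedFailures);

Console.WriteLine($"Error budget remaining: {budgetRemaining:P1}");  // ~28.0%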

Other topics

Why did Honeycomb build their own database?

At Honeycomb, Charity decided to build their own database despite the common wisdom to never do it. ClickHouse wasn't a thing back then: if it was, perhaps they would have not built it.

The database, called Retriever, is a column-based store. The query planner runs using Lambda jobs. Data is aged out to S3 after being written to SSDs.

It’s been a win, looking back now. The data model is custom, and being able to iterate on it has been a force multiplier.

Observability and AI

AI intersects with observability in three areas:

1. When building and training models

2. When developing with LLM

3. When dealing with code of unknown origin produced by AI

Good AI observability can't exist in isolation; it must be embedded in good software observability.

The inputs for AI models come from different services, data, and humans, and this creates a trace-shaped problem.

Build vs Buy vs Open Source

The main trend across the industry: consolidation. Companies try to control their bills.

Most companies use vendors and don't want to deal with observability tools breaking at 2am.

Metrics still have a place, but most companies need to move from 80% metrics/20% structured data to the reverse.

Frontend and mobile observability

Silos are created when different teams use different tools.

A unified view from mobile/browser to database is powerful.

Mobile is different because the build pipeline is different, and the inability to fold mobile into software development best practices.

Resources & Mentions

Where to find Charity Majors:

• X: https://x.com/mipsytipsy

• LinkedIn: https://www.linkedin.com/in/charity-majors/

• Blog: https://charity.wtf/

Mentions during the episode:

• Honeycomb: https://www.honeycomb.io/

• Parse: https://parseplatform.org/

• Ruby on Rails: https://rubyonrails.org/

• Christine Yen on LinkedIn: https://www.linkedin.com/in/christineyen/

• Scuba: Diving into Data at Facebook: https://research.facebook.com/publications/scuba-diving-into-data-at-facebook/

• Three pillars: https://charity.wtf/tag/three-pillars/

• Unified storage: https://charity.wtf/tag/unified-storage/

• “Every Sperm is Sacred”:

• Peter Borgan on LinkedIn: https://www.linkedin.com/in/peterborgan/

• Datadog: https://www.datadoghq.com/

• Vertica: https://en.wikipedia.org/wiki/Vertica

• Ben Hartshorne on LinkedIn: https://www.linkedin.com/in/benhartshorne/

• Cardinality: https://en.wikipedia.org/wiki/Cardinality_(data_modeling)

• COBOL: https://en.wikipedia.org/wiki/COBOL

• Ben Sigelman on LinkedIn: https://www.linkedin.com/in/bensigelman/

• OpenTelemetry: https://opentelemetry.io/

• Kubernetes: https://www.cncf.io/projects/kubernetes/

• SLOs: https://docs.honeycomb.io/notify/alert/slos/

• ClickHouse: https://clickhouse.com/

• Why We Built Our Own Distributed Column Store: https://www.honeycomb.io/resources/why-we-built-our-own-distributed-column-store

• "Why We Built Our Own Distributed Column Store" by Sam Stokes:

• "How we used serverless to speed up our servers" by Jessica Kerr and Ian Wilkes:

• Inside Figma’s Engineering Culture: https://newsletter.pragmaticengineer.com/p/inside-figmas-engineering-culture

• How to debug large, distributed systems: Antithesis: https://newsletter.pragmaticengineer.com/p/antithesis

• Observability in the Age of AI: https://www.honeycomb.io/blog/observability-age-of-ai

• Grafana: https://grafana.com/

• Prometheus: https://prometheus.io/

• What Is Real User Monitoring (RUM)?: https://www.honeycomb.io/getting-started/real-user-monitoring

• Crashlytics: https://en.wikipedia.org/wiki/Crashlytics

• Square wheels comic: https://alexewerlof.medium.com/on-reinventing-the-wheel-201148f74642

• WhistlePig Whiskey: https://www.whistlepigwhiskey.com/

• George T. Stagg bourbon: https://www.buffalotracedistillery.com/our-brands/stagg.html

• Stagg Jr.: https://newportwinespirits.com/products/stago-jr-ksbw

• Fluke: Chance, Chaos, and Why Everything We Do Matters: https://www.amazon.com/Fluke-Chance-Chaos-Everything-Matters/dp/1668006529

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.


Phil Windleys Technometria

Zero Trust, Least Privilege, and Just-in-Time Access

Security professionals have long taught that organizations should limit the access individuals have in computer systems to just those privileges necessary to perform their job.

Security professionals have long taught that organizations should limit the access individuals have in computer systems to just those privileges necessary to perform their job. This is known as the principle of least privilege. The problem is that knowing this is a best practice and putting it into practice are two different things. Traditionally, organizations have used access control lists (ACLs) and role-based access control (RBAC) based on groups to authorize activities. These methods are static, requiring that lists or groups be updated manually as people move between roles or leave employment. When this is extended to partners, suppliers, and other services, the problem is that much worse. And excess standing privileges are a real security risk.

Standing privileges are characterized by persistent access regardless of whether the person is using the resource or not, predefined scope where role is used to define broad access, and minimal oversight with little monitoring and accountability. Standing privileges increase the attack surface, letting attackers exploit broad permissions without further escalation. In addition, over time people accumulate permissions beyond what they need to perform their current job, a situation known as privilege creep.

In an ideal world, least privilege is like a shrink wrap, constantly adjusting the access perimeter as the needs of the employee change. Sometimes they expand, and the shrink wrap expands to seamlessly grant any access needed to perform a job. Sometimes the needs shrink and the access perimeter of the employee contracts as well. By limiting access to just that necessary to perform tasks, least privilege ensures that the attack surface that an attacker can exploit is as small as possible.

Zero Trust

Sometimes it's easy to get least privilege and zero trust confused. Zero trust is an overall security framework that requires continuous trust verification. Zero trust is a strategic, overarching trust model for an entire organization. Least privilege, in contrast, is more tactical, determining who can access what and when they can do it.

To see how least privilege fits into a zero trust strategy, consider a database administrator (DBA). Least privilege might set access controls such that the DBA can manage databases, but cannot view customer data stored within them. If their credentials are compromised, the attacker cannot steal sensitive information because the DBA’s privileges are tightly scoped. Zero trust relies on the DBA's access being valid, but might also check for unusual activity such that if the DBA appears to be accessing the database from a new location or at an unusual time, access is curtailed until the DBA is re-authenticated.

As the example shows, least privilege is an important part of zero trust, but only part. Other tactics that play in a zero trust strategy include device management, multi-factor authentication, and segmenting applications and processes (i.e., microsegmentation) to make fine-grained access control possible. Still, least privilege is a core part of a zero trust strategy. And least privilege depends on eliminating standing privileges.

Eliminating Standing Privileges

Recent developments in dynamic authorization have made it easier than ever to eliminate standing privileges. Standing privileges are what result when automatically updating an employee's access in response to changing circumstances is difficult. Modern policy-based authorization systems like Cedar allow organizations to control access through policies that state who can access what and under what conditions. These policies are managed like code, making them easier to maintain. More importantly, they can automatically respond to changing circumstances.

For example, the first of the following two Cedar policies forbids anyone from accessing any resource that's considered "critical" unless they're on a managed device with an acceptable security posture. The second permits employees in the Finance group to access any finance application. These policies work together, so that if the finance application is also a critical resource, then finance employees would also have to be on a managed device with an acceptable security posture. As employees change roles, that group is automatically updated from the HR system, growing or shrinking access as needed.

forbid(
    principal,
    action,
    resource in Category::"CriticalResources"
) unless {
    context.device.security.risk == "Secure" &&
    context.device.security.assessment.overall >= 80
};

permit(
    principal in UserGroup::"Finance",
    action,
    resource in Category::"FinanceApplications"
);

While policy-based access control (PBAC) can be used to mimic a role-based access control system, proper system segmentation (i.e., being able to identify finance applications) and device management allow finer-grained access control where employees' actions can be restricted to specific systems and only when they're using a company-owned device that is properly secured. Access control can be limited to certain times or from specific locations. All of these reduce the surface area of an employee's access perimeter for better security.

Just-in-Time Access Control

We can shrink the access perimeter even further using just-in-time access control. Just-in-time (JIT) access control is a mechanism that grants users, applications, or processes access to resources only for a limited time and only when it is explicitly needed. This minimizes the risks associated with long-term or persistent access privileges. While the principle of least privilege focuses on ensuring users or systems have the minimum permissions required to perform their roles or tasks, JIT access control refines this further with several other important features:

Time-Bound Privileges: JIT ensures permissions are time-limited, reducing the window of opportunity attackers have to exploit rarely-used accounts. For example, a developer needing access to a production server might receive elevated privileges only for a specific task and only for a set duration.

Dynamic Privilege Assignment: Even when policies are evaluated for each access, the permissions they allow are available indefinitely. JIT adds another dimension to the dynamic nature of PBAC, where privileges are granted on demand and revoked automatically when no longer needed.

Granular Control: Dynamic privilege assignment complements PBAC by making access more granular—not just who can access what, but also when and for how long.

JIT access control might, for example, be used to limit access to payroll functions except during the window when payroll is being processed. Another example is in so-called "break-glass" situations where access is granted to production resources in an emergency or in the case of unexpected failure. Other examples use dynamic roles (e.g., on call) to grant access or require a third party (like a manager) to approve access.

These scenarios can fit within a policy-based authorization architecture using mechanisms such as dynamically altering roles or adding information to the authorization request context based on querying other systems or utilizing permission tokens that carry time-based permissions with them. For example, the following policy makes use of an assignment to process payroll in the oncall tracking system to ensure only people with an oncall assignment of "payroll" can process the payroll.

forbid(
    principal,
    action == "Payroll::Process",
    resource
) unless {
    context.oncall.assignment == "payroll"
};

For this to work, the authorization agent that creates the request for the policy authorizer has to ensure that the context for the request includes the correct oncall assignment and oncall assignments need to be automated.
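As a concrete sketch of that flow (my illustration, not from the post), the agent below queries the on-call system at request time and passes the assignment along in the request context; IOncallDirectory and IPolicyEngine are hypothetical stand-ins for the oncall tracking system and a Cedar-style policy authorizer.

using System.Collections.Generic;
using System.Threading.Tasks;

public record AuthzRequest(string Principal, string Action, string Resource,
    Dictionary<string, object> Context);

// Hypothetical interfaces standing in for the systems named in the post.
public interface IOncallDirectory
{
    Task<string?> GetAssignmentAsync(string principal);
}

public interface IPolicyEngine
{
    Task<bool> IsAuthorizedAsync(AuthzRequest request);
}

public class AuthorizationAgent
{
    private readonly IOncallDirectory _oncall;
    private readonly IPolicyEngine _policies;

    public AuthorizationAgent(IOncallDirectory oncall, IPolicyEngine policies)
    {
        _oncall = oncall;
        _policies = policies;
    }

    public async Task<bool> CanProcessPayrollAsync(string principal)
    {
        // Query the oncall tracking system at request time so the privilege
        // exists only while the assignment is active (just-in-time).
        var assignment = await _oncall.GetAssignmentAsync(principal);

        var request = new AuthzRequest(
            principal,
            "Payroll::Process",
            "Payroll",
            new Dictionary<string, object>
            {
                ["oncall"] = new { assignment = assignment ?? "none" }
            });

        // The policy authorizer evaluates the forbid/unless policy above against this context.
        return await _policies.IsAuthorizedAsync(request);
    }
}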

Putting it All Together

Zero trust, the principle of least privilege, and just-in-time access work together to create a comprehensive security framework.

Zero trust enables an overarching strategy that mandates systems, resources, and actions be designed such that it is possible to continuously verify every action.

Principle of Least Privilege forms the heart of the zero trust strategy, mandating authentication and authorization systems that can dynamically grant fine-grained access through policy.

Just-in-time authorization augments the authorization system so that permissions can be granted not just based on who and where, but also only when they are needed to perform critical actions.

The well-known adage that you can't buy security applies to zero trust and JIT. There are products and technologies that make just-in-time, dynamic fine-grained access control possible, but besides putting them in place, you must also integrate them, make any necessary changes to other systems, and implement governance, monitoring, and auditing to ensure they work. These are organizational changes that take time, money, and perseverance. Done right, the payoff is not just a reduced attack surface and better security, but more productive employees as well. This may seem counter-intuitive, but poorly implemented, piecemeal security measures put much of the burden for keeping systems safe on employees who deal with tactics like frequent, yet ineffective password changes or requesting, and then waiting for, permissions to do their job.

When dynamic access control with JIT access is thoughtfully implemented, you shift the burden of security from employees to systems that automate protection, making it proactive and intelligent. Reducing friction so that employees can do their job while also enhancing security requires balance and a holistic approach that aligns technology, process, and culture. This transformation requires real effort but offers substantial rewards: resilient, secure systems; empowered employees; and peace of mind.

Photo Credit: Shrink Wrapped Computers from DALL-E (public domain) Prompt: draw a wide format picture of several colorful, shrink wrapped computers and smartphones.

Tuesday, 21. January 2025

The Pragmatic Engineer

Microsoft’s developer tools roots

Big Tech and startups, from the inside. Highly relevant for software engineers and managers, useful for those working in tech.

Ask 10 startups or scaleups if they use the “Microsoft stack”, and I’d be willing to bet that almost none do. By “Microsoft stack”, I mean using any one of C# or F# as a programming language, the .NET framework, ASP.NET for web development, or SQL Server as a database. However, if you asked the same companies if they use one or all of Visual Studio Code, GitHub, Copilot, TypeScript, or npm, then all may reply that they do.

The questions might seem different, but they are in fact almost identical. VS Code, GitHub, TypeScript, npm are all technologies owned and operated by Microsoft which are increasingly popular with startups and modern tech companies, but just don’t have strong Microsoft branding.

The Windows maker is quietly investing more in developer tools than any Big Tech giant. It acquired GitHub for $7.5B in 2018. It likely spent hundreds of millions of dollars per year to develop Visual Studio Code – and then gave it away for free while upselling some of its services. It also offers GitHub Copilot for $10/month or $100/year, which may currently be a loss-making price point.

So, why is Microsoft investing so much into developer tooling, and why does the rest of Big Tech seem comparatively uninterested in competing in this area? This question has been bugging me, so I attempted to find an answer. In order to properly understand what’s happening today, it’s useful to go back in time to when Microsoft became a highly profitable “dev tools monopoly” once before, in the 2000s.

This deep dive covers the first part of the story, covering:

A programming language interpreter company. Microsoft started out as a company selling BASIC interpreters to various hardware companies.

More and better developer tools. Following the success of MS-DOS and Windows, the company still built developer tools like Microsoft C, QuickC and MFC.

Journal for developers. Microsoft Systems Journal (MSJ). In the same year as going public, Microsoft started a printed magazine for MS-DOS and Windows developers.

Visual C++, a Development System for Windows. Microsoft created this IDE to make it easier to develop apps for Windows.

Visual Studio. The first IDE by the company that supported multiple programming languages.

Microsoft Developer Network (MSDN). Microsoft managed to figure out how to charge a large premium for quality documentation and access to the latest software.

“Developers, developers, developers!” This now-famous chant had more context: Microsoft knew that it needed developers to adopt the newly launched .NET framework, to make it a technology adoption success.

1. A programming language interpreter company

Most people associate Microsoft with the ubiquitous Windows operating system, but the company actually began by creating a language interpreter.

Building a BASIC interpreter in two months

On New Year’s Day in 1975, an advert for a minicomputer appeared on the front page of Popular Electronics, then the world’s most popular electronics magazine. It was for the Altair 8800, a small computer with an Intel 8080 CPU. The price was $439 (around $2,500 today) and the Altair couldn’t do much by itself: it had to be extended with memory (up to 4KB), and additional interface boards needed to be purchased to make it practical to use, like a typing board, cassette tapes, floppy disks, etc. All these were separate purchases, and some weren’t available at launch.

The price was incredibly low for its time. Before the Altair 8800, computers that were equally capable cost several times more, whereas the Altair was accessible to hobbyists. Its manufacturer, MITS (Micro Instrumentation and Telemetry Systems), hoped to sell 200 units, but sold 2,500 units in five months, and the device became the first commercially successful computer.

The ad announcing the Altair 8800 in Popular Electronics

The advert caught the attention of programmers Bill Gates and Paul Allen, who predicted the device would quickly become popular, and spied an opportunity to develop software for it. They contacted MITS offering to develop an interpreter for a popular programming language called BASIC (Beginner's All-Purpose Symbolic Instruction Code). It’s a simple enough language:

INPUT "Enter the value of n: ", N
IF N <= 1 THEN PRINT N : END
A = 0
B = 1
FOR I = 2 TO N
  C = A + B
  A = B
  B = C
NEXT I
PRINT "The", N, "th Fibonacci number is:", B
END

BASIC code to calculate the nth Fibonacci number

Allen and Gates did not have an Altair computer, but realized time was of the essence, so Allen wrote an Altair simulator based solely on the Intel manual for the 8080 chip (!!). Meanwhile, Gates wrote the BASIC interpreter to run on this simulator. In March, Paul Allen flew to meet the manufacturer, and demonstrated that it worked flawlessly: he loaded the interpreter into the machine, and then proved correctness by typing in simple programs like printing 2+2, sums of numbers, squares of numbers, and so on.

MITS was interested in distributing BASIC with the Altair, as the software would make their kit more useful for developers. This was the point when Allen and Gates created a corporate entity to do business: on 4 April 1975 they named and founded “Microsoft.” The company’s first contract was a licensing agreement to allow Altair to distribute their version of BASIC under the name “Altair BASIC.”

Microsoft BASIC

Microsoft’s emergence coincided with the start of a boom in affordable personal computers; every new hardware maker wanted to ship software to make its machines more appealing to professional and hobbyist developers. A BASIC interpreter made all such systems easier to sell, and Microsoft developed and licensed many more BASIC interpreters to other manufacturers, including to Apple for the Apple II, to IBM, the Z-80, and the Commodore 64.

Smartly, these versions of BASIC were called “Microsoft BASIC” and the company extended the functionality of BASIC, adding improved string manipulation and better graphics support. Plus, the “Microsoft” brand name was displayed whenever the interpreter started up, which raised the company’s profile with users.

The first version of Microsoft Basic for the Apple II was on a cassette. Source: Global Nerdy

2. More and better developer tools

Microsoft’s commercial breakthrough began with creating and distributing the DOS operating system in 1981, and then the Windows operating system in 1985. In 1990, Microsoft launched Windows 3.0 and also introduced Microsoft Word and Microsoft Excel. This combination of an operating system with a word processor and spreadsheet made Windows even more popular. Here’s how sales of the different versions compared:

Read more


Wrench in the Gears

Andreessen and Horowitz’s Debrief On Their Fairshake PAC and Sunny Days Ahead For “Little Tech” Under Trump

I took notes on Marc Andreessen and Ben Horowitz’s post-election discussion of their Fairshake pro-crypto Pac and the coming Trump presidency about two months ago, but set them aside and other topics took precedence. Given the rising craziness in the crypto space around the inauguration, I thought it was a good time to put them [...]

I took notes on Marc Andreessen and Ben Horowitz’s post-election discussion of their Fairshake pro-crypto PAC and the coming Trump presidency about two months ago, but set them aside and other topics took precedence. Given the rising craziness in the crypto space around the inauguration, I thought it was a good time to put them out for consideration. A quick glance at the finances of their pro-crypto / pro-blockchain PAC should be a wake-up call to anyone doubting the rising tide of automated, smart contract law and global cyber governance. We need to wrap our heads around the fact that it’s less about money and power and more about information theory, token engineering, and the shift of social systems into a networked super intelligence operating under the United Nations Sustainable Development Goals for expanded “human potential.” It is a bipartisan effort. It is an international effort. It will be framed as a “populist uprising” to take back “the commons,” engineered of course by the world’s largest asset holders. In my opinion, meme-coins are biosemiotic aids facilitating emergent swarm intelligence. Tread carefully around those digital mind viruses, folks. A lot of money, and tokens, are being thrown around trying to get their planned cyberphysical-sociotechnical future to stick.

 

Source: https://web.archive.org/web/20250118220652/https://www.followthecrypto.org/committees/C00835959
Source: https://embed.kumu.io/57d6a066cd053fc3241a8ee0c984a34e#untitled-map?s=bm9kZS1JVzhhQVBxVg%3D%3D

Ripple is a major donor to the Fairshake PAC. See its relation to digital ID and MIT Media Lab / social physics and automated legal systems.

My video recorded today on Trump’s second inauguration:

The Why Files “irradiated wolves” video: 

The A16Z discussion comparing the Biden / Trump positions on crypto and emerging tech: 

An older presentation on Laura Arrillaga Andreessen (Marc’s wife) and social impact finance at Stanford:

Two older presentations on next-gen nuclear being promoted vigorously by A16z:

Monday, 20. January 2025

Damien Bod

ASP.NET Core user application access token management

This article looks at managing application access tokens in an ASP.NET Core web application. Any application with or without a user can use application access tokens as long as the application can persist the tokens in a safe way. Code: https://github.com/damienbod/token-mgmt-ui-application Blogs in this series Setup The ASP.NET Core web application authenticates using OpenID Connect […]

This article looks at managing application access tokens in an ASP.NET Core web application. Any application with or without a user can use application access tokens as long as the application can persist the tokens in a safe way.

Code: https://github.com/damienbod/token-mgmt-ui-application

Blogs in this series

ASP.NET Core user delegated access token management
ASP.NET Core user application access token management
ASP.NET Core delegated OAuth Token Exchange access token management

Setup

The ASP.NET Core web application authenticates using OpenID Connect and OpenIddict as the secure token server. The application needs to use data from an app-to-app resource. An OAuth client credentials flow is used to get an application access token to access the API. The OAuth client credentials flow can only be used when the application can keep a secret. This token has nothing in common with the delegated access token from the user authentication. The token is persisted once per application. An in-memory cache is used for this. The application sends the application access token as a bearer token to the API.

What must an application manage?

An access token management solution must ensure that tokens are securely stored per application for application tokens and updated after each UI authentication or refresh. The solution should be robust to handle token expiration, function seamlessly after restarts, and support multi-instance deployments. The tokens must be persisted safely in multiple instance setups. Additionally, it must effectively manage scenarios involving invalid or missing access tokens.

Properties of token management in the solution setup:

The access token is persisted per application
The token expires
The token needs to be persisted somewhere safely (safe and encrypted storage if not in-memory)
The solution must work after restarts
The solution must work for multiple instances when deployed to multi-instance deployments
The solution must handle invalid access tokens or missing access tokens

Implementation example

An ApplicationAccessTokenCache service is used to manage the access tokens for the application. The service is registered as a singleton and runs once for the whole application. Each request scope can use this. The application looks in the cache for a valid token and if no valid token is present, the service requests a new access token using the OAuth client credentials flow. The token is persisted to the cache using the client ID. This means only one token can exist per client definition.

using IdentityModel.Client;
using Microsoft.Extensions.Caching.Distributed;
using System.Text.Json;

namespace Ui;

/// <summary>
/// Cache persists token per application
/// </summary>
public class ApplicationAccessTokenCache
{
    private readonly ILogger<ApplicationAccessTokenCache> _logger;
    private readonly HttpClient _httpClient;
    private readonly IConfiguration _configuration;

    private static readonly object _lock = new();
    private readonly IDistributedCache _cache;

    private const int cacheExpirationInDays = 1;

    private class AccessTokenItem
    {
        public string AccessToken { get; set; } = string.Empty;
        public DateTime ExpiresIn { get; set; }
    }

    public ApplicationAccessTokenCache(
        IConfiguration configuration,
        IHttpClientFactory httpClientFactory,
        ILoggerFactory loggerFactory,
        IDistributedCache cache)
    {
        _configuration = configuration;
        _httpClient = httpClientFactory.CreateClient();
        _logger = loggerFactory.CreateLogger<ApplicationAccessTokenCache>();
        _cache = cache;
    }

    public async Task<string> GetApiToken(string clientId, string scope, string secret)
    {
        var accessToken = GetFromCache(clientId);

        if ((accessToken != null) && (accessToken.ExpiresIn > DateTime.UtcNow))
        {
            return accessToken.AccessToken;
        }

        _logger.LogDebug("GetApiToken new from secure token server for {clientId}", clientId);

        var newAccessToken = await GetInternalApiToken(clientId, scope, secret);
        AddToCache(clientId, newAccessToken);

        return newAccessToken.AccessToken;
    }

    private async Task<AccessTokenItem> GetInternalApiToken(string clientId, string scope, string secret)
    {
        try
        {
            var disco = await HttpClientDiscoveryExtensions.GetDiscoveryDocumentAsync(
                _httpClient, _configuration["OpenIDConnectSettings:Authority"]);

            if (disco.IsError)
            {
                _logger.LogError("disco error Status code: {discoIsError}, Error: {discoError}", disco.IsError, disco.Error);
                throw new ApplicationException($"Status code: {disco.IsError}, Error: {disco.Error}");
            }

            var tokenResponse = await HttpClientTokenRequestExtensions.RequestClientCredentialsTokenAsync(_httpClient,
                new ClientCredentialsTokenRequest
                {
                    Scope = scope,
                    ClientSecret = secret,
                    Address = disco.TokenEndpoint,
                    ClientId = clientId
                });

            if (tokenResponse.IsError)
            {
                _logger.LogError("tokenResponse.IsError Status code: {tokenResponseIsError}, Error: {tokenResponseError}", tokenResponse.IsError, tokenResponse.Error);
                throw new ApplicationException($"Status code: {tokenResponse.IsError}, Error: {tokenResponse.Error}");
            }

            return new AccessTokenItem
            {
                ExpiresIn = DateTime.UtcNow.AddSeconds(tokenResponse.ExpiresIn),
                AccessToken = tokenResponse.AccessToken!
            };
        }
        catch (Exception e)
        {
            _logger.LogError("Exception {e}", e);
            throw new ApplicationException($"Exception {e}");
        }
    }

    private void AddToCache(string key, AccessTokenItem accessTokenItem)
    {
        var options = new DistributedCacheEntryOptions()
            .SetSlidingExpiration(TimeSpan.FromDays(cacheExpirationInDays));

        lock (_lock)
        {
            _cache.SetString(key, JsonSerializer.Serialize(accessTokenItem), options);
        }
    }

    private AccessTokenItem? GetFromCache(string key)
    {
        var item = _cache.GetString(key);

        if (item != null)
        {
            return JsonSerializer.Deserialize<AccessTokenItem>(item);
        }

        return null;
    }
}

The ApplicationUsersService class uses the access token from the token service. This is a scoped service and the data is requested from the API using a bearer token in the authorization header.

using IdentityModel.Client;

namespace Ui;

public class ApplicationUsersService
{
    private readonly IConfiguration _configuration;
    private readonly IHttpClientFactory _clientFactory;
    private readonly ApplicationAccessTokenCache _apiTokenCacheClient;

    public ApplicationUsersService(IConfiguration configuration,
        IHttpClientFactory clientFactory,
        ApplicationAccessTokenCache apiTokenCacheClient)
    {
        _configuration = configuration;
        _clientFactory = clientFactory;
        _apiTokenCacheClient = apiTokenCacheClient;
    }

    /// <summary>
    /// HttpContext is used to get the access token and it is passed as a parameter
    /// </summary>
    public async Task<string> GetPhotoAsync()
    {
        try
        {
            var client = _clientFactory.CreateClient();
            client.BaseAddress = new Uri(_configuration["AuthConfigurations:ProtectedApiUrl"]!);

            var access_token = await _apiTokenCacheClient.GetApiToken(
                "CC",
                "myccscope",
                "cc_secret"
            );

            client.SetBearerToken(access_token);

            var response = await client.GetAsync("api/ApplicationUsers/photo");

            if (response.IsSuccessStatusCode)
            {
                var data = await response.Content.ReadAsStringAsync();
                if (data != null)
                    return data;

                return string.Empty;
            }

            throw new ApplicationException($"Status code: {response.StatusCode}, Error: {response.ReasonPhrase}");
        }
        catch (Exception e)
        {
            throw new ApplicationException($"Exception {e}");
        }
    }
}

The required services are added to the application in the program file.

builder.Services.AddSingleton<ApplicationAccessTokenCache>();
builder.Services.AddScoped<ApplicationUsersService>();
builder.Services.AddHttpClient();
builder.Services.AddDistributedMemoryCache();

The token cache works great when using in-memory cache. If using a persistent cache, care needs to be taken that the access tokens are persisted in a safe way.
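One way to do that in ASP.NET Core (a sketch of mine, not from the original post) is to wrap the distributed cache with the Data Protection API so that token payloads are encrypted at rest; the ProtectedTokenCache name and the purpose string are illustrative.

using Microsoft.AspNetCore.DataProtection;
using Microsoft.Extensions.Caching.Distributed;

public class ProtectedTokenCache
{
    private readonly IDistributedCache _cache;
    private readonly IDataProtector _protector;

    public ProtectedTokenCache(IDistributedCache cache, IDataProtectionProvider provider)
    {
        _cache = cache;
        // A purpose string isolates these payloads from other protected data.
        _protector = provider.CreateProtector("ApplicationAccessTokenCache");
    }

    public void Set(string key, string json, DistributedCacheEntryOptions options)
        => _cache.SetString(key, _protector.Protect(json), options);

    public string? Get(string key)
    {
        var protectedItem = _cache.GetString(key);
        return protectedItem == null ? null : _protector.Unprotect(protectedItem);
    }
}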

Notes

In follow up blogs, I will look at the different ways and the different types of strategies which are used to implement token management in ASP.NET Core web applications.

Microsoft.Identity.Web delegated access tokens & OBO access tokens
Microsoft.Identity.Client application tokens
Azure SDK tokens
Handling multiple access tokens
OAuth Token Exchange for downstream user delegated access tokens

Links

https://learn.microsoft.com/en-us/aspnet/core/security/authentication/social/additional-claims

https://github.com/dotnet/aspnetcore/issues/8175

Sunday, 19. January 2025

Doc Searls Weblog

The Blame Game

Twenty-third in the News Commons series Seventh on the #LAfires Disaster coverage tends to go through four stages: Live reporting. TV stations stop all advertising and go into round-the-clock coverage. Radio stations drop the feeds from elsewhere and go wall-to-wall with live reports. Newspapers drop their paywalls. Coverage by the best of them ranges from […]

Twenty-third in the News Commons series
Seventh on the #LAfires

Images from a Google image search for “LA fires aftermath.” Sources listed below.†

Disaster coverage tends to go through four stages:

1. Live reporting. TV stations stop all advertising and go into round-the-clock coverage. Radio stations drop the feeds from elsewhere and go wall-to-wall with live reports. Newspapers drop their paywalls. Coverage by the best of them ranges from good to heroic. As stories go, they are all about the problem.

2. Human interest. The worst is over, so the stories go to human loss, resourcefulness, forbearing, kindness, and mendacity. Reporters on TV stand in front of some hellscape and talk to victims, heroes, or anyone they can find. This leverages the need for characters and their problems, with stories about how they survived, who or what they lost, always at least with hints toward movement forward.

3. Blame. This is all about character. Reporters are taught to seek the five W’s: Who, What, Where, When, and Why, sometimes appending an H for How. But since stories always require characters, Who is what matters most.

4. Coming to an understanding. While that’s the goal, and something nearly everyone wants, it’s provisional while the story continues. Also, full understanding, if it ever comes, will be at the end. And stories like the LA fires may never end, because there are contexts for them that may never change.

But that’s where we are now. In Stage 4. Here’s a Google search I just did:

Note that all of these stories are one to three days old. All of them are provisionally conclusive (meaning they still keep the story going). And all of them are less interesting than Stage 3: Blame.

I want to focus on that because blame makes story-telling—and story-generating—easy. Too easy.

That’s why Donald Trump—the best story-generator (and blame-placer) the world has ever known (and I mean this seriously)—wasted no time fingering Gavin Newsom:

Politifact spent 942 words explaining, among other things, that there was plenty of water, that a “water restoration declaration” never existed, that the smelt thing concerned water for elsewhere, and that what Trump said was wrong in other ways. But, as Ronald Reagan said, If you’re explaining, you’re losing. And Trump never loses, because that’s out of character. (Again, I mean this seriously. We’re talking stories here. The man is all-character, all the time.) And it is impossible for the media, which now includes everybody, not to participate in his story processes.

Of course, stories being what they are, Trump has hardly been alone in playing the blame game. Scott Galloway:

LAFD Chief Kristin Crowley took time out from fighting the fires to blame Mayor Karen Bass for cutting $17.6 million from the department’s budget. The department’s $895 million budget actually increased by $58 million as a result of the latest union contract. To get that higher pay to firefighters, the department reduced spending on equipment and training. The mayor’s response added to the drama, fueling rumors that the chief had been fired. Right-wing podcasters blame Fire Chief Crowley’s gayness, saying she was a DEI hire. Note: She has outstanding qualifications.

For all the good work being done at Stage 4, the urge to blame will never cease.

And we need some. Justice demands it.

What kind of justice? Well, there are many: retributive, distributive, restorative, procedural, social, environmental, economic, and so on. Of all those, retributive justice is what people want most and best understand. Perpetrators must be found, punished, and made to pay.

But starting with blame is too easy, too simple, often wrong, and always incomplete. Other factors matter. Conditions. Contexts. Black swans.

That’s why coming to an understanding is far more important, especially if we want to minimize or avoid the same kinds of disasters in the future. That’s what I’ll visit in my next series of posts.

Meanwhile, some of my open tabs:

LA Times: How to help those affected by fires raging across Los Angeles County.

KNBC/4: Asher Bingham, an LA artist, is drawing pictures of houses lost in the fire. “What I’m fascinated with is taking something from a picture and building a story to it,” she tells “So Asher picked up a pen and shared the artwork to Instagram with a simple offer to anyone that has lost a home in the LA fires. ‘I will draw your home for free… Three million views on that video later, there is now a waiting list of hundreds of requests.”

Wired: Los Angeles Needs to Fireproof Communities, Not Just Houses.

Business Insider via MSN: I’m a pilot flying a DC-10 fire bomber to battle the LA fires. This is what I’ve seen — and this is what needs to stop.

Kwaai Events: Fire Resilience & Beyond with Personal AI Hackathon 2025 – $30K in Prizes.

Washington Post: A key ‘weakness’ in L.A.’s wildfire strategy went unaddressed for years, Post probe shows.

† Sources via Google Image search: Reuters, NBC News, Marketwatch, ABC News, Hindustan Times, ABC, NPR, USA Today, The Indian Express, Nevada Appeal, YouTube, Commercial Observer, East Bay times, CBC, India TV News, Urban Land Magazine, Hope Standard, TMZ, Yahoo, Bloomberg, Instagram, and iObserve.

Saturday, 18. January 2025

Wrench in the Gears

Trump Dance, Oscillation, And Going “Inside”

In this morning’s video I somehow missed the fact that the inauguration coincides with MLK Day, which is an interesting sync with the “melanin as optical material” lens. I’m going to drop a few links to old blog posts that touch on the YMCA and “out of school time learning” as well as my [...]

In this morning’s video I somehow missed the fact that the inauguration coincides with MLK Day, which is an interesting sync with the “melanin as optical material” lens. I’m going to drop a few links to old blog posts that touch on the YMCA and “out of school time learning” as well as my “Momo” read-aloud playlist. Maybe I should do a read-aloud of Bentov’s “Stalking the Wild Pendulum,” too? I’m understanding it much better now. Look for a discussion of quantum realism with Joseph Gonzales in the coming week. I’ll just say it once again, I can’t believe I have to talk about any of this, but if you don’t take it too seriously, conjecturing can be rather fun.

https://x.com/MAHAAction/status/1879945173721051166

Going “inside”

Momo Read Aloud Playlist
https://wrenchinthegears.com/2017/05/08/out-of-school-time-learning-a-wolf-in-sheeps-clothing/
https://wrenchinthegears.com/2018/03/18/too-big-to-map-but-i-tried/
https://wrenchinthegears.com/2019/11/15/the-family-friendly-schools-act-a-set-up-for-soft-policing-schools-to-profit-impact-investors/
https://wrenchinthegears.com/2023/02/27/gods-eye-view-part-5-prediction-markets-in-public-policy/
https://wrenchinthegears.com/2023/02/05/dallas-esports-and-bucky-ballers-guest-post-and-follow-up-discussion/
https://ymcadallas.org/programs/youth-sports/esports

Friday, 17. January 2025

Wrench in the Gears

Gestalt, Quakers and Carbon Trading – Do You See Faces Or A Vase?

In this video I draw connections between Gestalt theory and organizational development as applied to digital groupthink in the context of Christiana Figueres, a leader in the development of carbon trading, who trained at the Gestalt Institute of Cleveland. As a young woman, she attended Swarthmore College, a Quaker institution. The first wife of her [...]

In this video I draw connections between Gestalt theory and organizational development as applied to digital groupthink in the context of Christiana Figueres, a leader in the development of carbon trading, who trained at the Gestalt Institute of Cleveland. As a young woman, she attended Swarthmore College, a Quaker institution. The first wife of her father, three-time president of Costa Rica and MIT alumnus, had ties to Quaker expats (Henrietta Boggs’ uncle and aunt) who emigrated south from Alabama in the 1950s.

Kenneth Boulding, a Quaker economist who developed some of the first ideas around social impact investing, and his wife Elise, a sociologist who built the field of peace studies, were affiliated with Swarthmore, though Kenneth only did a year’s stint there as a guest lecturer.

When our family toured the campus for a college visit, it was made clear that only the top-tier candidates would be accepted, because many slots were reserved for equity outreach. Interestingly a high percentage of students presenting that day asserted their intentions to become economists with aspirations of working at the World Bank and the IMF. That was an eye-opener for sure! 

https://prabook.com/web/christiana.figueres/451403 https://www.swarthmore.edu/news-events/un-climate-chief-christiana-figueres-79-among-most-influential-people-2016 https://acrcarbon.org/news/winrock-international-applauds-board-member-christiana-figueres-on-united-nations-climate-change-secretariat-appointment/ https://www.ecosystemmarketplace.com/articles/christiana-figueres-close-personal-carbon-markets/ https://embed.kumu.io/a257abdf4aa4cbf3f75b90154c4f5099#untitled-map?s=bm9kZS1ycmxMMHJvNw%3D%3D https://www.neh.gov/humanities/2014/januaryfebruary/statement/first-lady-the-revolution https://www.quakersintheworld.org/quakers-in-action/257/Quakers-in-Costa-Rica https://alum.mit.edu/slice/mits-costa-rican-president

I use the visual of the optical illusion of paired silhouettes with a vase in the center to describe how two narratives can be presented, but often only one is truly “seen.” This Gestalt concept is applied to parallel narratives – one presented in two recent stories on Unlimited Hangout that foreground militarism and data surveillance in AI healthcare and the story I keep attempting to tell about gamified managed community “wellness” protocols linked to social impact finance deals on blockchain that on another level will be leveraged to catalyze emergent behaviors and possibly advance bio-hybrid information processing.

I also describe a strange encounter in which Aetna health care intended to target low-income communities of color for VR guided meditation in a digital “jungle” setting. The second hour of the video is a repeat of two site visits Jason and I made in Denver in 2022 to the then new Palantir headquarters and the state capitol.

Thursday, 16. January 2025

The Pragmatic Engineer

The Pulse #120: What does Salesforce’s software engineering hiring freeze mean?

Also: how an hour-long GitHub outage is a reminder of the real cost of reliability, what Cloudflare’s recruiting statistics say about the job market, and more.

The Pulse is a series covering insights, patterns, and trends within Big Tech and startups. Notice an interesting event or trend? Send me a message.

Today, we cover:

Industry pulse. TikTok’s US sell-or-shut-down deadline nears, CEO fired over disastrous Sonos software update, unusual use of AI in elite tennis, NVIDIA’s “personal” AI computer, Meta cuts DEI programs, and more.

What does Salesforce’s software engineering hiring freeze mean? Amusing: of the two highest-paid corporate functions (sales and engineering), the one getting a hefty headcount increase is sales, the function that cannot show its productivity being improved by GenAI tools.

GitHub outage: the real cost of reliability. Turns out, an hour-long outage per quarter is well within 99.9% reliability. GitHub offers this SLA to its customers as the default, while GitLab Dedicated promises 99.5% (which allows for up to 10 hours/quarter in downtime). It will cost you if you need more reliability. A quick sanity check of these numbers follows after this list.

What Cloudflare’s recruiting statistics say about the job market. There were about 6x more applicants for every open position at Cloudflare in 2024 than in 2021. From 2021 to 2024, Cloudflare consistently hired about the same number of people.

How the commercial sector and academia regained computing lead. From the 1940s to the 1990s, software engineering breakthroughs were frequently coming from work done within governmental organizations. Since then, this has changed, with Big Tech, startups and academia taking a lead in pushing innovation ahead – at least for now.
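
As a rough sanity check on those SLA numbers (assuming a 90-day quarter, so about 2,160 hours): a 99.9% SLA leaves roughly 0.1% of 2,160 hours ≈ 2.2 hours of allowed downtime per quarter, so a single hour-long outage fits comfortably, while a 99.5% SLA leaves roughly 0.5% of 2,160 hours ≈ 10.8 hours per quarter.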

1. Industry pulse

TikTok’s US sell-or-shut-down deadline nears

TikTok is one of the biggest social media platforms, with around 1.5 billion users globally and circa 121 million in the US, many of them younger in age. It’s owned and operated by Bytedance, which is based in China, and there are security concerns relating to this: India banned TikTok in 2020, while the US, Canada, EU and UK ban the app from government devices.

The US has taken a step in India’s direction: in April 2024, it passed a bill that requires TikTok be sold to a US company in order to bring it within US jurisdiction, or be banned from the US. The deadline for this is close: Sunday, 19 January. At the time of publication no sale had been announced, but despite the absence of a buyer so far, TikTok is an extremely attractive acquisition, with one in three US citizens using it. So far, speculation about potential buyers has focused on:

Elon Musk: the owner of X, a much smaller social network than TikTok. As one of the world’s richest people, and with his growing political influence in the US as part of the incoming Trump administration, Musk could be a serious bidder.

Other individuals could pull syndicates together to buy TikTok:

The most-subscribed YouTuber, Mr Beast

Former US Treasury secretary, Steven Mnuchin

A Shark Tank investor, Kevin O'Leary

Activision’s former CEO, Bobby Kotick

It’s curious that no large tech companies like Meta, Snap, Microsoft, or Google have been mooted as potential buyers, which could be due to concerns about antitrust regulation.

A likely outcome is that TikTok sells to the highest bidder, whoever that is; although the video-sharing app might have a trick up its sleeve: spook US lawmakers into backing out of a ban.

The Information reports TikTok plans to shut down the app on Sunday, in a move almost guaranteed to trigger a sizable reaction from its 100M+ US users. On the principle that politicians respond to pressure, the strategy seems to be to generate a noisy response by US voters that causes politicians to change course away from a ban, out of fear of being punished at the ballot box in future elections by disgruntled TikTok users who blame elected officials for the popular app being unavailable.

To be fair, TikTok has little to lose and a lot to gain: in the worst case, it sells to the highest bidder, and in the best case, it gets more time to operate in the US, and perhaps to increase the value of offers on the table for it.

CEO fired over disastrous Sonos software update

Read more

Wednesday, 15. January 2025

The Pragmatic Engineer

“The Coding Machine” at Meta with Michael Novati

In today’s episode, I’m joined by Michael Novati, Co-founder and CTO of Formation. Michael spent eight years at Meta, where he was recognized as the top code committer company-wide for several years.

Before we start: as an experiment, below the takeaways, I’m including a more detailed, bullet-point summary in this issue. This is an experiment: please let me know how you like it!

Stream the Latest Episode

Available now on YouTube, Apple and Spotify. See the episode transcript at the top of this page.

Brought to You By

Vanta — Automate compliance and simplify security with Vanta.

WorkOS — The modern identity platform for B2B SaaS.

In This Episode

In today’s episode of The Pragmatic Engineer, I’m joined by Michael Novati, Co-founder and CTO of Formation. Before launching Formation, Michael spent eight years at Meta, where he was recognized as the top code committer company-wide for several years. The “Coding Machine” archetype was modeled after Michael at the company.

In our conversation, we talk about what it was like working at Meta and dive into its engineering culture. Michael shares his journey of quickly climbing the ranks from intern to principal-level and gives level-headed advice on leveling up your career. Plus, we discuss his work at Formation, where he helps engineers grow and land roles at top tech companies.

In this episode, we cover:

An overview of software architect archetypes at Meta, including “the coding machine”

Meta’s org structure, levels of engineers, and career trajectories

The importance of maintaining a ‘brag list’ to showcase your achievements and impact

Meta’s engineering culture and focus on building internal tools

How beating Mark Zuckerberg in a game of Risk led to him accepting Michael’s friend request

An inside look at Meta’s hiring process

Tips for software engineers on the job market on how to do better in technical interviews

And more!

Takeaways

Here are my biggest takeaways from this episode:

1. The importance of archetypes at Meta. Archetypes are initially modelled after existing engineers at Meta, and they serve two main roles:

Fairness: offer a way for the company to “pattern match” upcoming engineers against these personas, and have a fair system to determine who is at this higher level, and who is not

Career path: offer a non-manager career path that goes beyond the Staff engineer (E6) level. Before archetypes were a thing, it was unclear how to get promoted to E7 and E8 and above levels – where E8 is the equivalent of Director-level (D1) roles

Archetypes are ingrained in Meta’s engineering culture and are a major differentiator compared to other Big Tech companies that lack such nuanced differentiators at the Staff+ engineering levels.

2. There’s a limit on how much influence an IC can have, even at Meta. Although Meta offers IC career paths that are better defined at the Staff+ levels than most other large tech companies, at the Principal-and-above engineering levels there are still more directors than engineers. Michael used to think this was unfair – but, over time, he realized why this is. As he put it:

“Even if you write 10x the code or 100x the code, you cannot replace 3,000 engineers with one engineer. So you can be a VP of engineering, though, overseeing 3,000 engineers. No matter how you multiply it out: even if you are just the manager of 10 ‘superstar’ engineers, you still have more influence/impact over the direction of those people in the company.”

It’s helpful to understand the realistic and necessary limitations of the individual contributor path in terms of influence, within a large organization.

3. Tenure can become surprisingly important at a fast-growing scaleup. Michael recalled how when he became an E7 (the equivalent of a Principal Engineer at other, similar companies) – he became a part of a tightly knit group of E7+ engineers. Here, the cultural divide between those that had been at Meta for a long time – and promoted into this position – and those hired externally was strong.

Many of the external hires struggled to make the kind of impact that tenured E7+ engineers could, and lots of external hires ended up leaving the company relatively quickly.

Michael observed this during the earlier phase of Facebook/Meta, when it was growing very quickly. While the culture might have changed since, this highlights how challenging it can be to “pick up” the culture of a fast-growing scaleup from outside, and how being with the company for a longer time can help you get more stuff done efficiently – and thus also grow faster in your career.

4. Causing an outage at a fast-moving scaleup is not the end of the world. Michael shared an amusing outage story: he accidentally overloaded a node (a virtual machine) by shipping a prototype version of a photo mask on Facebook profiles, to add support for a cause. The feature blew up a few days after going live, because the system could not handle a node being written millions of times per hour to update two-way graph edges between a user’s profile and this image template. On top of this, the failure cascaded into other parts of the system.

Rolling back this change was not an option. In this case, the database infrastructure team stepped in; made the edge one-way (rather than two-way) and thus removed the write bottleneck.

Michael learned an important lesson, and in the end it was still a net win for Facebook to discover, a day or two after launch, that the feature was wildly popular. Spending a lot of time building a feature that might never get used would have been a worse investment – during this early growth stage at Facebook, that is!

5. Interview processes have not changed much over the last decade! Michael coaches engineers to prepare for interviews, so he has first-hand experience in this. With his words:

“The interview styles haven't changed since before Leetcode, and since after Leetcode. It’s the style that has been around. Facebook didn't invent these interviews: it borrowed a bit from Yahoo, Microsoft and Google. Google also borrowed from other companies at the time.

These days, we see a lot of AI companies with “Facebook-like” processes and vibes: such as OpenAI.

The interviews are the ‘Leetcode interviews’ where they test language-agnostic problem-solving skills.

It’s always been the point to test for these kinds of problems: because it’s what engineers do! Solve problems, regardless of what specific tech stack or programming language you use.”

One consequence of the interview process not changing much while the job market has become more competitive is that the bar to do well in these interviews has gone up. There are more and better preparation materials, so the “average” candidate does better in these interviews than years before. Preparing for interviews at Big Tech companies and scaleups is no longer a “nice to have”: it’s a necessity even for strong engineers who want to get a job offer.

A summary of the conversation

For those of you more interested in reading a summary of the conversation, see it here. This is an experiment — please leave a comment on how you find this addition!

From intern to E7 in 6 years

Michael joined Meta (then Facebook) as an intern and, remarkably, reached the E7 level (equivalent to principal engineer) in just six years. This rapid career progression is unusual, as a path like this would typically take at least a decade.

His relationship with his managers was critical, built on mutual trust and transparency. His managers knew he was passionate and sometimes impulsive but trusted his judgement. Michael also felt that he could help his managers in their jobs. He was receptive to direct feedback, allowing him to address issues quickly.

He maintained a "notepad" of his accomplishments, noting down significant fixes, bugs, or other contributions. This helped him summarise his work and make sure he was hitting the requirements of the next level, and he would review these with his manager.

From his first days as an intern, Michael demonstrated his coding skills. On his second or third day, he noticed the company's org chart tool was clunky and inefficient. Without asking for permission, he rewrote the tool, creating a horizontal layout and shipping it. This was very well received by colleagues.

As a product engineer, Michael worked on various teams, including internal tools, Facebook Groups, News Feed, Facebook Workplace and Messenger for Kids. He spent about 30% of his time working on his assigned product teams as a senior engineer would. The remaining 70% of his time was spent on large-scale refactoring, code cleanups, and company-wide projects.

Michael became known as the "coding machine" at Meta. The company developed this archetype, in part, to describe Michael's unique impact.

The "coding machine" archetype is for engineers who can move projects forward, unblock other people, refactor code quickly, and help launch products that may typically require a team of engineers.

The archetype was created after comparing him to other engineers at the E7 level, focusing on the overall impact he was making, which was similar to other E7s but not within the existing archetypes, such as "fixer".

While anyone can write a lot of code, what makes a "coding machine" is the impact the code has. This impact is measured by how much it moves projects forward, helps launch products, unblocks people, and speeds up refactoring.

The "coding machine" archetype was championed by an executive, Tom Allison, who helped connect the dots to make the archetype a reality.

Michael explains that at Meta, engineers are compared to professional sports teams. While everyone at a high level is good at basic tasks, people have specialities. Michael's was moving prototypes forward and refactoring code really fast.

Meta’s engineering culture

Meta has an engineering-first culture, where individual contributors are highly valued and empowered. The company wanted to create career paths for talented individual contributors, so that they did not have to become managers to progress.

Internal tools at Meta are treated as products. They were built with the same code base as user-facing tools. This made the internal tools team one of the most exciting to work on because engineers could build product at a much faster pace.

Meta built most of their infrastructure from scratch which resulted in custom tools. Because internal tools were seen as products, it gave the company an engineering product ecosystem.

Michael's intern project was an internal meeting scheduling tool, designed to quickly find meeting times and rooms for groups of people.

Michael recalls that Meta had custom version control, code review and build tools. While the company used SVN as a backbone, they used Mercurial locally on people's machines to manage their local branches. The company chose Mercurial because it was easier to work with the open-source team to hack into it, which aligned with Meta’s culture.

Many internal tools that Meta created have seeded other companies. Examples include Statsig (experimentation platform) and Honeycomb (observability).

The values of moving fast, breaking things and being bold were all reinforced at Meta. If you moved really fast and broke something, you would not get fired. If you were making a bold bet and pushing limits that was also rewarded, even if it didn't work out.

Michael shared a story about how he became Facebook friends with Mark Zuckerberg. During a game of Risk, he formed an alliance with Mark, only to betray him later to win the game. Despite this – or perhaps because of it! –, Mark accepted his friend request that had been pending for some time by then.

At Meta, product reviews are a regular part of the development cycle, also known as “Zuck Reviews”. These 15-minute presentations allowed Mark Zuckerberg to give direction and feedback on products. He asked a lot of detail-focused questions to pin down the exact details and make sure that the best possible product was being built.

Michael caused quite the outage, one time. A prototype feature he built allowed users to overlay a photo template on their profile picture. When a large number of users used the French flag template, the system crashed because Michael had designed the database to use two-way edges for the prototype. One of the nodes got overloaded, causing cascading effects. While a two-way edge was not recommended, Michael explains he made this decision to simplify things for a prototype.

The interview process at Meta, during Michael’s time

The interview process at Meta typically starts with a recruiter screen, followed by a technical screen (a 45-minute coding interview, usually with two questions). If that goes well, candidates then attend an on-site interview, which includes two more coding interviews, a behavioral interview, and a systems design interview.

During Michael’s time, these interviews had the names “Jedi”, “Pirate” and “Ninja”.

Meta’s technical interviews are distinct because they are whiteboarding style with almost no small talk. Interviewers jump into the technical questions and expect candidates to walk through a clear problem-solving process, without compiling the code to check if it works.

After the onsite interview, there is a debrief where interviewers share feedback. If there are no red flags, then the candidate goes to the hiring committee.

The hiring committee consisted of a quorum of at least three director or VP-level engineering leaders. A recruiter presents a packet about the candidate. The default at this stage is that a candidate is likely to be hired, so focus is typically on red flags.

The packet contains detailed information, including feedback from the interviewers, the interviewer's history, questions asked, and how many times the questions have been asked. This helps the directors calibrate and interpret the feedback.

The hiring committee looked for flags and inconsistencies, but the most common decision point was determining the candidate’s level. Michael attended these meetings because he wanted to ensure that Facebook did not lower its hiring bar as it scaled.

Michael notes that his time on the hiring committee has been helpful in his current business, coaching people, and that he can offer a different point of view to engineers trying to understand how it all works.

Advice for software engineers to grow professionally

Michael advises that finding the right job for the right alignment is more important than just checking the boxes and passing an interview.

He notes that the interview processes at many top tech companies are similar, originating from Yahoo, Microsoft and Google and the style has been consistent. This style focuses on testing language and stack-agnostic problem-solving skills.

Michael compares interview preparation to going to a personal trainer. He advises that engineers must get back in shape to prepare for interview processes, regardless of how much experience they have.

The job market for software engineers has changed and is more competitive. There are now more steps in the process. Companies are using online assessments and implementing team matching.

In his most productive year, Michael made thousands of code commits (diffs). While at Meta, most of his code was in Hack, a version of PHP. He now primarily codes in Javascript. His favourite language now is Typescript.

Michael advises that storytelling is a key way to communicate, influence and share as humans. He recommends the book The Histories by Herodotus.

The Pragmatic Engineer deepdives relevant for this episode

Inside Meta’s engineering culture

Stacked diffs (and why you should know about them)

Engineering career paths at Big Tech and scaleups

Inside the story of how Meta built the Threads app

Timestamps

(00:00) Intro

(01:45) An explanation of archetypes at Meta, including “the coding machine”

(09:14) The organizational structure and levels of software engineers at Meta

(10:05) Michael’s first project re-writing the org chart as an intern at Meta

(12:42) A brief overview of Michael’s work at Meta

(15:29) Meta’s engineering first culture and how Michael pushed for even more for ICs

(20:03) How tenure at Meta correlated with impact

(23:47) How Michael rose through the ranks at Meta so quickly

(29:30) The engineering culture at Meta, including how they value internal tools

(34:00) Companies that began at Meta or founded by former employees

(36:11) Facebook’s internal tool for scheduling meetings

(37:45) The product problems that came with scaling Facebook

(39:25) How Michael became Facebook friends with Mark Zuckerberg

(42:05) The “Zuck review” process

(44:30) How the French attacks crashed Michael’s photo inlay prototype

(51:15) How the photo inlay bug was fixed

(52:58) Meta’s hiring process

(1:03:40) Insights from Michael’s work at Formation

(1:09:08) Michael’s advice for experienced engineers currently searching for a job

(1:11:15) Rapid fire round

Resources & Mentions

Where to find Michael Novati:

• X: https://x.com/michaelnovati

• LinkedIn: https://www.linkedin.com/in/michaelnovati/

• Facebook: https://www.facebook.com/mn/

Mentions during the episode:

• Software Architect Archetypes: https://newsletter.pragmaticengineer.com/p/software-architect-archetypes

• Formation: https://formation.dev/

• Get your work recognized: write a brag document: https://jvns.ca/blog/brag-documents/

• A Work Log Template for Software Engineers: https://blog.pragmaticengineer.com/work-log-template-for-software-engineers/

• GitHub: https://github.com/

• Mercurial: https://www.mercurial-scm.org/

• Statsig: https://statsig.com/

• Sentry: https://sentry.io/welcome/

• Graphite: https://graphite.dev/

• Mark Zuckerberg at Startup School 2013:

• Mark Zuckerberg at Startup School 2012:

• Risk board game: https://en.wikipedia.org/wiki/Risk_(game)

• Wecode: https://wecode.io/en/

• CodeSignal: https://codesignal.com/

• HackerRank: https://www.hackerrank.com/

• Hack: https://engineering.fb.com/2014/03/20/developer-tools/hack-a-new-programming-language-for-hhvm/

• Javascript: https://www.javascript.com/

• Typescript: https://www.typescriptlang.org/

• The Histories: https://www.amazon.com/Histories-Herodotus/dp/0140449086

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.


Damien Bod

ASP.NET Core user delegated access token management

The article looks at managing user delegated access tokens for a downstream API in an ASP.NET Core web application. There are many ways of implementing this, all with advantages and disadvantages. The tokens are requested from an OpenID Connect server using the recommended standards. In this blog, the UI access token from the application authentication […]

The article looks at managing user delegated access tokens for a downstream API in an ASP.NET Core web application. There are many ways of implementing this, all with advantages and disadvantages. The tokens are requested from an OpenID Connect server using the recommended standards. In this blog, the UI access token from the application authentication is re-used for the downstream API.

Code: https://github.com/damienbod/token-mgmt-ui-delegated-direct

Blogs in this series

ASP.NET Core user delegated access token management
ASP.NET Core user application access token management
ASP.NET Core delegated OAuth Token Exchange access token management

Setup

The solution uses a secure downstream API and requires user delegated access tokens for access. The UI application is implemented using ASP.NET Core and razor pages. The UI application authenticates against an OpenID Connect server implemented using OpenIddict. The application client is implemented using an OpenID Connect confidential client and the OpenID Connect code flow with PKCE. This flow returns an access token which is used for the downstream API.

What must an application manage?

An access token management solution must ensure that tokens are securely stored per user session for user delegated tokens and updated after each UI authentication or refresh. The solution should be robust to handle token expiration, function seamlessly after restarts, and support multi-instance deployments. Additionally, it must effectively manage scenarios involving invalid or missing access tokens.

Properties of token management in the solution setup:

The access token is persisted per user session
The token expires
The token needs to be persisted somewhere safely
The token must be replaced after each UI authentication (per user)
The solution must work after restarts
The solution must work for multiple instances when deployed to multi-instance deployments
The solution must handle invalid access tokens or missing access tokens

Implementation example using Duende Token Management

The Duende.AccessTokenManagement.OpenIdConnect NuGet package can be used to implement these requirements for user delegated access tokens. This is not the only way to do this; it can be implemented in numerous different ways. With this package, the AddOpenIdConnectAccessTokenManagement method is used to add the token management.

builder.Services.AddOpenIdConnectAccessTokenManagement();

HTTP request handlers or Razor pages can use the GetUserAccessTokenAsync method to get the access token for each user and session. SaveTokens is set to true to persist the tokens in the session, using a secure cookie by default; this is an option of the AddOpenIdConnect method. The defined scope is used to select the access token.

// token expires, token management required
var accessToken = await HttpContext.GetUserAccessTokenAsync(
    new UserTokenRequestParameters
    {
        Scope = "myscope"
    });
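
For context, the SaveTokens option mentioned above belongs to the standard OpenID Connect client registration rather than to the Duende package. The following is a minimal sketch of what that registration in Program.cs might look like; the authority URL, client id, secret, and scope names are placeholders, not values from the referenced repository.

builder.Services.AddAuthentication(options =>
{
    options.DefaultScheme = CookieAuthenticationDefaults.AuthenticationScheme;
    options.DefaultChallengeScheme = OpenIdConnectDefaults.AuthenticationScheme;
})
.AddCookie()
.AddOpenIdConnect(options =>
{
    // placeholder values for illustration only
    options.Authority = "https://localhost:44318";
    options.ClientId = "ui-client";
    options.ClientSecret = "--secret--";
    options.ResponseType = "code"; // OpenID Connect code flow (PKCE is used by default in ASP.NET Core)
    options.Scope.Add("openid");
    options.Scope.Add("profile");
    options.Scope.Add("myscope"); // the scope used to select the access token
    options.SaveTokens = true;    // persist the tokens in the session cookie
});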

The AddUserAccessTokenHttpClient method can be used to add a named HttpClient client for the access token and the specific API.

builder.Services.AddUserAccessTokenHttpClient("profileClient", configureClient: client =>
{
    client.BaseAddress = new Uri(profileApiBaseUrl!);
});

This client can then be used to access the API resource using the access token.

var client = _clientFactory.CreateClient("profileClient");
var response = await client.GetAsync("api/Profiles/photo");
if (response.IsSuccessStatusCode)
{
    var data = await response.Content.ReadAsStringAsync();
    if (data != null)
        return data;

    return string.Empty;
}

When the applications are started, the access token can be used to request data from the downstream API.

Notes

In follow up blogs, I will look at the different ways and the different types of strategies which are used to implement token management in ASP.NET Core web applications.

OAuth Client credentials tokens in web applications
Microsoft.Identity.Web delegated access tokens & OBO access tokens
Microsoft.Identity.Client application tokens
Azure SDK tokens
Handling multiple access tokens
OAuth Token Exchange for downstream user delegated access tokens

Links

Duende.AccessTokenManagement.OpenIdConnect

Duende token management

https://learn.microsoft.com/en-us/aspnet/core/security/authentication/social/additional-claims

https://github.com/dotnet/aspnetcore/issues/8175

Tuesday, 14. January 2025

Jon Udell

The Configuration Crisis

Perhaps, even though they are not themselves explainable, AIs can help us engineer explainable systems. But I’m not optimistic. It feels like we’re on a path to keep making systems harder for humans to configure, and we keep expanding our reliance on superhuman intelligence to do that for us. The Configuration Crisis and Developer Dependency … Continue reading The Configuration Crisis

Perhaps, even though they are not themselves explainable, AIs can help us engineer explainable systems. But I’m not optimistic. It feels like we’re on a path to keep making systems harder for humans to configure, and we keep expanding our reliance on superhuman intelligence to do that for us.

The Configuration Crisis and Developer Dependency on AI

LLM series at The New Stack


Wrench in the Gears

The Day An Evangelical Christian Convinced ChatGPT I Was A Hippy, Schwab-Aligned Sociopathic Magician

What will come of large language models when they mirror unsettled psyches defined by religious hero complexes into the cloud? We do ourselves and future generations a profound disservice when we choose not to make time to comprehend and interrogate how emerging technologies and narrative flows are being deployed to steer psychological states individually and [...]

What will come of large language models when they mirror unsettled psyches defined by religious hero complexes into the cloud? We do ourselves and future generations a profound disservice when we choose not to make time to comprehend and interrogate how emerging technologies and narrative flows are being deployed to steer psychological states individually and as holon-based collectives through socio-technical, cyberphysical systems and the tokenized attention economy.

I recognize that rather than being a victim of Substack mania once again, I was merely a reflection of another’s discomfort in the world, an obstacle that gave someone a useful anchor point to exert his will and express his agency, however uninformed and misguided. It’s not a good feeling, but an informative lesson nonetheless.

It’s time to wrap our heads around tokenized spirituality and organized religion. My gut tells me Texas is ground zero for integration of the Drucker-Buford metaverse church model with digital community currencies tied to “moral” markets and social impact “charity” in ATT smart cities designed for community “wellness.” 

I don’t think AI should be approached as a digital intercessor for “divine” moral guidance. I view web3 systems as linguistic scaffolds quietly mapping the terrain of our souls. Once harvested, our somatic psychological GPS coordinates can be used ad infinitum to play us like instruments in the coming LARPy animist musical of blockchained cloud minds to solve gnarly problems that can only be untangled by a network of diverse, distributed intelligences – with digital wallets.

Follow your heart, think clearly, and ground yourself. It is easy to get swept away by the Sturm und Drang we generate around ourselves to make the hero roles we inhabit make sense. I’m not going to link to Andrew Corner’s Substack directly, but it’s not hard to find – I’m in the “energy” one. If you go there, fair warning, it’s a slog that’s equal parts disturbing and amusing.

What are we becoming folks? I don’t think the “hybrid” intelligence program is going to be a good look for many of us.

At the end of the day this may very well have simply been another signals-intelligence operation. It doesn’t seem like anyone really pays attention to that Substack anyway. If so, I hope there are at least a few useful takeaways regarding how we should engage with these systems.

https://apparentselves.org/team/pranab-das/

My Santa Barbara map with Alice Bailey.

https://embed.kumu.io/0b2d7e56dffb9dace92cd6f7f09e4a3e#untitled-map?s=bm9kZS05VWJrdHREQg%3D%3D

My Peter Drucker / Bob Buford map of the origins of the Christian mega-church movement.

https://embed.kumu.io/34b47105cbce2883a7cd6639d207e509#untitled-map?s=bm9kZS1DaXRiWmxsMg%3D%3D

My site visits to the Templeton Foundation in Conshohocken, PA – “Money Will Teach You How To Pray”


Doc Searls Weblog

How Facts Matter

Twenty-second in the News Commons series Sixth on the #LAfires Facts don’t matter, or they matter much less than people think—Daniel Kahnemann Facts don’t matter. What matters is how much we hate the person talking—Scott Adams But facts do matter when life and death are on the line. Or when one is recovering from a […]

Twenty-second in the News Commons series
Sixth on the #LAfires

Facts don’t matter, or they matter much less than people think—Daniel Kahneman

Facts don’t matter. What matters is how much we hate the person talking—Scott Adams

But facts do matter when life and death are on the line. Or when one is recovering from a life-changing situation. To help make the case for that, here are two collections of facts about the #LAfires:

A useful visualization of the CAL FIRE Damage Inspection (DINS) database for the Eaton Fire.
A useful visualization of the CAL FIRE Damage Inspection (DINS) database for the Palisades fire.

These are being updated on an ASAP basis by CAL FIRE (the California Department of Forestry and Fire Protection) as they inspect damage across those two fires’ burned neighborhoods.

Here is a close-up of one section of the Eaton Fire map, with one moused-over pop-out:

The photo link in the pop-out goes here:

The house next door appears undamaged. That’s the green one on the map. By looking at Google Maps, I can see those houses are 3305 and 3310 North Marengo Avenue in Altadena. Note that the map has a search field in the top left corner. When one looks for an address (such as the first of those two), a pop-out appears with this:

“Show more results” and “Zoom to” lead to additional information. Would it help to have a list of all the affected structures, their addresses and conditions? Sure. Exactly seven years ago, after a massive debris flow destroyed much of Montecito, California (one zip code away from our house in Santa Barbara), I posted a list like that in this blog post. My source map was this one (now at the Santa Barbara County Fire Department):

Mousing over those dots popped up an address for each one, and I used those to hand-produce a master list with the addresses for all the structures that had been lost or damaged. The idea was to have a Web page where people could quickly see what may have happened to their house—or the house of somebody they knew, using a simple text search of a Web page. Maybe somebody will do the same thing here.

I bring this up in obedience to my mother’s prime command to my sister and me when we were kids: Make yourself useful.

And that’s what I hope I’m also doing with this post. It’s not just about sharing some maps. It’s about pulling facts away from stories, which I wrote about in my last post. To spare you a click, I said stories are the base format of human interest, and the main product of journalism—and that they have just three requirements: character, problem, and movement.

Stories serve many purposes, of course. One is persuasion. For more on that, check out Win Bigly: Persuasion in a World Where Facts Don’t Matter, by Scott Adams, of Dilbert fame:

The world where facts don’t matter is the digital one we live in now, alongside the physical one where all those homes were lost. In the digital world, the three requirements of stories can be used against journalism as well as by it. Toward both of those uses, I wrote Where Journalism Fails during the first Trump administration, and some of it bears repeating:

We have never seen Donald Trump’s like before, and may never again. His genius at working all three elements is without equal in our time—or perhaps any time. So please, if you can, set your politics aside and just look at the dude through the prism of Story.

Donald Trump spins up stories at least four ways:

Through constant characterization of others, for example with nicknames (“Little Mario,” “Low Energy Jeb,” “Crooked Hillary,” “Sleepy Joe,” “Failing New York Times”)
By finding or creating problems, and characterizing those too: “witch hunt,” “fake news,” “illegal ballots,” “Dominion-izing the Vote.”
By creating movement via the Roy Cohn and Roger Stone playbook: always attack and counter-attack, sue constantly, and claim victory no matter what. (Roy Cohn was a lawyer Frank Rich felicitously called “The worst human being who ever lived … the most evil, twisted, vicious bastard ever to snort coke at Studio 54.” Talk about character: Cohn was absolutely interesting. As Politico puts it here, “Cohn imparted an M.O. that’s been on searing display throughout Trump’s ascent, his divisive, captivating campaign, and his fraught, unprecedented presidency. Deflect and distract, never give in, never admit fault, lie and attack, lie and attack, publicity no matter what, win no matter what, all underpinned by a deep, prove-me-wrong belief in the power of chaos and fear.”)
By playing the ultimate alpha. That’s why he constantly calls himself the winner, and “loser” is one of his favorite epithets.
By de-legitimizing facts, truths, norms, and those who traffic in them. Key to this is accusing others of wrongs he commits himself. This is why he labels CNN and other news organizations “fake news” while raising falsity to an art form. Often, his accusations against others are a reliable tell of his guilt for doing the same thing.

As for movement, every new problem Trump creates or intensifies is meant to generate an emotional response, which is movement in itself.

Look closely: Trump’s news is deliberate, theatrical, and constant. All of it is staged and re-staged, so every unavoidably interesting thing he says or does pushes the last thing he said or did off the stage and into irrelevance, because whatever he’s saying or doing now demands full attention, no matter what he said or did yesterday.

There is genius to this, and it requires understanding and respect—especially by those who report on it.

You can call this trolling, or earned media coverage, meaning the free kind. Both are true. Comparing Trump to The Mule in Isaac Asimov’s Foundation and Empire also makes sense. (The Mule was a mutant with exceptional influence over the emotions of whole populations. It was by noting this resemblance that I, along with Scott Adams, expected Trump to win in 2016.)

This speaks of two big fails for journalism:

Its appetite for stories proves a weakness when it’s fed by a genius at hogging the stage.
It avoids reporting what doesn’t fit the story format. This includes most of reality.

My favorite priest says “Some truths are so deep only stories can tell them,” and I’m sure this is true. But stories by themselves are also inadequate ways to present essential facts people need to know because they exclude what doesn’t fit “the narrative,” which is the modern way to talk about story—and to spin journalists. (My hairs of suspicion stand on end every time I hear the word “narrative.”)

So here’s the paradox: We need to know more than stories can tell, yet stories are pretty much all people are interested in. Character, problem, and movement give shape and purpose to every human life. We can’t correct for it.

That’s why my topic here—a deep and abiding flaw (also a feature) of both journalism and human nature—is one most journalists won’t touch. The flawed nature of The Story itself is not a story. Same goes for  “earned media coverage.” Both are features rather than bugs because they cause much of journalism’s success and debugging them has proven impossible.

Ah, but we can debug the context of journalism, which is the world of facts—especially facts in the physical world. That’s why I’m sharing these CAL FIRE maps. Besides being extremely useful, they highlight how much more needs to be compiled and put in useful forms, so the stories journalism tells can be as fact-based as possible. In an upcoming post, I’ll talk about a new time and history-based way to gather, compile, and archive facts that can maximize their usefulness to journalism in the present, when deadlines rule. If you want to learn a bit about this approach, I outline it in The Future, Present, and Past of News (eleventh in the News Commons series).

Next post: The Blame Game.

Bonus links:

2 Bay Area climate tech startups seek to help first responders in LA County wildfire fight is a good story on its own, and features old friend Jackson Yip and the work his new company is doing. (NBC Bay Area)
As Los Angeles burned, Latino immigrants rushed in to put out fires (LAist)
Home Losses From the LA Fires Hasten ‘An Uninsurable Future’ (Time)
Los Angeles Needs to Fireproof Communities, Not Just Houses (Wired)
Santa Barbara Talks: LA Journalist Tony Pierce on Fires, Media, Uber Column (Noozhawk)


The Pragmatic Engineer

Cross-platform mobile development

A deep dive into the most popular frameworks: React Native, Flutter, native-first, and web-based technologies, and how to pick the right approach

These days, it seems almost everyone owns a smartphone. In the US, 91% of adults have one, in Europe, this figure is 89%, while in India, Deloitte predicts 75% of adults will have a smartphone by 2026. In total, there are an estimated 4.8 billion smartphone users in the world, which is an incredible number! This means that for tech startups and tech businesses that build consumer products, it’s a baseline expectation for them to be usable on smartphones, and for there to be a mobile app for the product.

So, how do you build mobile apps? There’s plenty of choice: you can build a native mobile app for iOS using Swift or Objective C as a programming language, make one for Android using Java or Kotlin, and of course, you can develop a web app for desktop and mobile web users. All this adds up to three separate codebases and plenty of business logic replication.

Or you can do what startups like social media newcomer Bluesky did: have one codebase that powers the web, native iOS, and native Android apps. For Bluesky, a single developer wrote the initial version of all three apps using React Native and Expo. We cover more on this in the article, Inside Bluesky’s engineering culture.

There are cross-platform frameworks and approaches that offer a way to use a single codebase to power multiple native apps and a website. A decade ago, most cross-platform technologies were in their early stages, but things are shifting; in October 2022, we covered whether more cross-platform development could lead to less native iOS and Android hiring by startups.

Today’s article looks into current cross-platform development approaches, covering:

The ecosystem. Most apps remain fully native, with Flutter and React Native (RN) the clear leaders for cross-platform development. RN is more popular in the US and UK, and apps built with it tend to generate more money.

React Native. The framework of choice for many Meta, Microsoft, Amazon, and Shopify apps, and at places whose web teams work with React.

Flutter. Built and championed by Google, and the framework for most cross-platform apps.

Native-first approaches. Kotlin multiplatform, Swift-based frameworks (Skip, Scade), .NET MAUI (C#), NativeScript (JavaScript), and Unity.

Web-based frameworks. Cordova, Capacitor, Ionic, and Progressive Web Apps.

Choosing the right framework. A mental model for identifying key differences between all these technologies. In the end, most teams choose React Native or Flutter.

The bottom of this article could be cut off in some email clients. Read the full article uninterrupted, online.

Read the full article online

1. The ecosystem

What are the most popular cross-platform frameworks? Interesting research by Appfigures looked at all the top apps on the iOS App Store and Android’s Google Play, peeked into their binaries, and categorized them by the framework used:

The most popular frameworks for iOS and Android apps. Source: Appfigures

Other research published on Statista suggests Flutter and React Native are the most popular choices, followed by Cordova, Unity, and Ionic:

Cross-platform mobile adoption trends (2020-2023) based on a survey of 30,000 respondents. Flutter was used by 46% of respondents, and RN by 35% in 2023. Data source: Statista

Leaders: Flutter and React Native. These are the two most popular frameworks, but it can be tricky to identify which one leads: on iOS, there are more React Native-powered apps, while on Android, Flutter apps outnumber React Native ones. However, there are simply more Android apps than iOS ones, which is why there are more Flutter apps than React Native ones, overall. React Native has been around since 2015, and Flutter since 2017.

Shrinking: Cordova and Ionic. As per the Statista survey, both frameworks have smaller but shrinking market shares, with about 10-12% of survey participants using them. Their usage is likely more common at companies which were building cross-platform apps before React Native and Flutter emerged, and remain content to ship WebView-based applications.

Growth potential: Kotlin Multiplatform (KMP). This technology has modest adoption rates, but seems to be gaining momentum. JetBrains is investing heavily in it, while the Kotlin language is popular with native mobile developers, especially with Android folks.

React Native or Flutter more popular?

New data from the 2024 Stack Overflow Developer Survey offers pointers. Below is a breakdown of the mobile cross-platform frameworks used by circa 6,500 respondents:

Cross-platform mobile framework usage by developers. Data source: Stack Overflow

From this data, it’s clear that Flutter and React Native are the most popular by a distance, with more users than all other frameworks, combined. But which is the most popular, overall? To find out, let’s slice and dice the data; firstly by narrowing it down to only professional developers by removing hobbyist users:

Framework usage by professional developers of cross-platform apps. Source: Stack Overflow Developer Survey

Flutter is used by slightly more engineers, though the difference is perhaps smaller than before. What happens if we consider per-country usage? Let’s start with the US, UK, Canada and Australia:

Flutter vs React Native usage by country. Source: Stack Overflow Developer Survey

Let’s look at other countries with a higher number of responses – the Netherlands, France, Poland, Brazil, Germany, and India:

Flutter vs React Native usage by country. Source: Stack Overflow Developer Survey

Developer framework preference seems to be linked to location. Germany and India somewhat prefer Flutter, while the US and UK tend towards React Native. I don’t have an explanation of the difference in preference by country; specifically: why Flutter is so much more favored in Germany, but React Native more popular in the US and UK. If you have any thoughts on this, please share in the comment section, below.

Leave a comment

Flutter powers more apps, but React Native ones earn more revenue. It’s hard to accurately measure developers’ preferences, but determining the number of apps using each framework is easier. Appfigures did exactly this by tracking all apps released in a year and found that Flutter was used by 11% of apps released in 2024, while 7% used React Native:

Around 11% of apps released in 2024 used Flutter, 7% used React Native, and 4% Unity. Source: Appfigures

Appfigures also estimates revenue generated by apps. Despite having fewer users, React Native-built apps on aggregate generated more net revenue ($287M) than Flutter apps ($283M), after Apple and Google’s 30% cuts.

The following section looks into each framework.

2. React Native

React Native appears to be the most popular cross-platform framework in the US and UK markets. What are the likely reasons for this popularity?

No need for frontend developers to learn a new language. Developers who know JavaScript or TypeScript will have no trouble getting started with React Native, and devs working with React will find the transition especially easy. As we know, React is the most popular frontend framework, with around 70% of frontend developers using it in 2024, as per the State of Frontend 2024 survey.

Easy enough to hire for. React’s popularity means it’s a bit easier to hire for this skillset, than for native iOS or Android developers. The challenge of hiring native developers was one reason Coinbase moved to React Native in 2021.

Expo. The Expo framework is built to simplify development with React Native, and is especially useful for teams. It helped boost adoption of React Native; without Expo, developers must set up both Android Studio and Xcode, manage emulators and simulators, and manage the native code. React Native’s own documentation recommends using Expo when getting started, as doing so without it makes the work several times more complex. Most of Expo is open source, but some services like Expo Application Services (EAS) have paid tiers.

React Native was open sourced by Facebook, in 2015. As the name suggests, this framework allows creating cross-platform apps using syntax similar to React applications. Here’s what a “Hello, World” app looks like using React Native:

A simple React Native example. Source: React Native documentation

React primitives render to native platform UI elements, which means the compiled app uses the same native UI elements as native iOS and Android apps.

Check out a behind-the-scenes peek into how the React.js documentary was created from two years ago.

Well-known React Native apps

Some popular apps built with this technology include:

Discord. The social platform moved to React Native in 2016 for iOS, and in 2018, two engineers rebuilt the iOS app in React Native at a time when the app already had millions of daily users. The team held off on moving to Android for performance reasons, until in 2022 they moved the Android app to React Native; sharing the same codebase, and keeping iOS and Android-specific UI for each platform.

It’s worth noting Discord often opts for cross-platform technologies: its desktop apps for Windows, Mac, and Linux are based on Electron; a cross-platform desktop technology based on JavaScript, HTML, and CSS.

Discord’s iOS and Android app. Source: Discord

Coinbase moved to React Native in 2021, when it had 56 million users and $1.8B in revenue. Moving off native to RN involved migrating more than 200 screens, and retraining more than 30 native-only engineers. Interestingly, Coinbase claimed that moving to RN reduced their cold start time from 3.8 seconds to 2.5 seconds (still quite a lot, to be fair!), and improved reliability by increasing the crash-free rate from 99.4% to 99.7%.

A big motivation seemed to be to build more with fewer engineers, and make more consistent app experiences across platforms. The company labelled the transition a success: it reduced the number of codebases from 3 (iOS, Android and Web) to 2 (React Native and React Web), and web engineers could work on the mobile app, and mobile engineers on the web one.

Coinbase app built using React Native. Source: Coinbase

Shopify went all-in on React Native five years ago. Just this week, the company reflected on how it went, sharing:

More productivity: thanks to one codebase powering iOS and Android, and working across both apps

Performance and reliability: all pages have sub-500ms loads and crash-free rates are above 99.9%. Both are impressive numbers!

TypeScript for the win: using TypeScript makes it easy for devs to transfer between React and React Native.

There are downsides: debugging is worse than for native apps, and updating to new React Native versions can be painful.

Shopify employs more than 2,000 software engineers, and is one of the largest tech companies to go all-in on this technology. That it’s working for them, five years later, is a strong vote of confidence. Read more about Shopify’s 5-year reflections on RN.

Meta, Microsoft, and Amazon are not known for fully React-native apps, but do use plenty of RN functionality in their apps:

Meta: React Native’s creator utilizes it heavily for Facebook, Instagram, Ads Manager, Messenger, and Meta Horizon. The company recently shared that more than 5,000 of its engineers work with React code, and Meta apps have some clever React Native-related performance enhancements; for example, in the Facebook app, React Native is initialized when a user visits the first React Native surface, and not on app start. This allows for faster app startup.

Microsoft: the tech giant uses both React and React Native in products like Windows, XBox, Microsoft Office, Microsoft Teams, and other apps. The Windows maker is a heavy user of this technology for native performance and cross-platform code sharing reasons, as it said in 2022. Microsoft has also started to invest heavily in React Native for Windows and MacOS.

Amazon: parts of Amazon Shopping, Amazon Alexa, and Amazon Photos also utilize RN, as per the React Native showcase. Also, Amazon’s Kindle device uses it.

The home screen of Kindle is rendered with React Native, after Amazon migrated away from a Java-based UI in 2022.

The Kindle home screen is rendered using React Native. Source: Goodreader

There are plenty of other, well-known apps building on top of React Native. Bloomberg moved over to this framework shortly after it was launched, back in 2016.

Some high-profile apps that use React Native, even if not built exclusively with it. Source: React Native showcase

3. Flutter

Flutter was launched in 2017 by Google as a solution for cross-platform development. Initially, it targeted Android developers, allowing them to write code once for separate Android and iOS native applications.

Flutter uses the Dart programming language, a strongly-typed language with similar syntax to C# and Java. A clear downside of Flutter is the requirement to learn Dart. However, this is easy to pick up, especially with experience of Java or Kotlin. Here’s what a simple Flutter application looks like:

A simple Flutter app, written in Dart

RN uses native elements, but Flutter uses its own rendering engine, called Impeller. This design choice means Flutter offers a consistent UI experience across iOS and Android – and even the web! The rendering engine and the programming language of choice are the biggest differences compared to React Native and native development. Here is how Jordan Bonnet – formerly the first mobile engineer at Uber and currently founder of the Flutter user onboarding startup fluo.dev – explained this difference to me:

Summarizing the differences between the three platforms. The mental model shared by Jordan Bonnet, cofounder of fluo.dev

Performance: where Flutter flies

Read more

Monday, 13. January 2025

Kerri Lemole

Explaining Verifiable Credentials and Open Badges 3.0

Part 2: Issuing Badges In part one of this series, we discussed how the trust model of Open Badges 3.0 differs from earlier versions. Credentials issued as Open Badges 3.0 are based on the W3C Verifiable Credentials data model which describes how claims made online can be proven. To explore this trust model, let’s explore the story of Olivia who has been taking classes at a local commu
Part 2: Issuing Badges

In part one of this series, we discussed how the trust model of Open Badges 3.0 differs from earlier versions. Credentials issued as Open Badges 3.0 are based on the W3C Verifiable Credentials data model which describes how claims made online can be proven.

To explore this trust model, let’s explore the story of Olivia who has been taking classes at a local community college. Olivia is told that with each course she will be awarded and issued a badge that she can use to apply for employment.

Let’s start with what “issuing a badge” means using the most common approaches:

For older Open Badges (versions 0.5 to 2.1): The issuer saves completed course data in the format of an Open Badge that includes Olivia’s email address. When the issuer saves the data, the badge is considered “issued”. This data will be used to populate a web page or a file on the web. This web page or file must be public for Olivia to share online and be considered verifiable. In order for Olivia to do this, she must have an account on the issuing platform associated with the same email address.

For Open Badges 3.0 & W3C Verifiable Credentials: The issuer decides that Olivia should be awarded a badge and provides a secure link for Olivia to retrieve it. Olivia uses a digital wallet to click the secure link which makes a request to the issuing software to produce the badge. The issuing software retrieves and digitally signs the badge data; at this moment the badge has been by definition “issued.” (Prior to this digital signature, the badge itself did not exist.) It is sent to Olivia’s wallet where she claims and secures it, now able to share it privately at her discretion. It is important to note that this credential is hosted locally on Olivia’s device. There is no web page or URL for this credential until Olivia decides to create a public link.
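
To make “issuing means signing” concrete, here is a minimal Python sketch of the core idea, assuming an Ed25519 issuer key via the cryptography library. The badge fields, issuer identifier, and key handling below are hypothetical stand-ins, and real Open Badges 3.0 credentials use standardized Verifiable Credential proof formats rather than this simplified scheme.

# Illustrative sketch only: real Open Badges 3.0 issuance uses standardized proof
# formats (Data Integrity proofs or JWTs), not this simplified signing scheme.
import json
from cryptography.hazmat.primitives.asymmetric import ed25519
# Hypothetical issuer key pair; in practice the private key stays with the issuer.
issuer_private_key = ed25519.Ed25519PrivateKey.generate()
issuer_public_key = issuer_private_key.public_key()
# A minimal, hypothetical badge payload for Olivia's completed course.
badge = {
    "type": ["VerifiableCredential", "OpenBadgeCredential"],
    "issuer": "https://college.example/issuers/1",  # placeholder identifier
    "credentialSubject": {"name": "Olivia", "achievement": "Intro to Databases"},
}
# "Issuing" the badge: the issuer digitally signs the credential data.
payload = json.dumps(badge, sort_keys=True).encode("utf-8")
signature = issuer_private_key.sign(payload)
# A verifier later checks the signature against the issuer's public key;
# verify() raises InvalidSignature if the data changed after signing.
issuer_public_key.verify(signature, payload)
print("signature checks out; badge unchanged since issuance")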

Different Trust Models

Older Open Badges are hosted on the web. In this model, trust in the credential is based on the reputation of the badging platform hosting the credentials. Perceptions of trust in websites or platforms can be based on elements like issuer identity confirmation, availability of content, and security of the website. Open Badges 3.0 and Verifiable Credentials are digitally signed, portable files. In this model, trust is based on the cryptographic digital signature and privacy. The digital signature in the credential is readable by the verification software, which looks for information indicating which entity signed the credential and whether the data has changed since it was signed. Confirmation of the issuing entity’s identity happens later using an issuer registry (to be explained in more detail in a future post).

Olivia has unfettered access to the credential and can share it knowing that when she does, her actions aren’t being tracked or communicated to the issuer. When she shares her credential, it is signed by the digital identity associated with the credential issuer so that the verifier can trust that she has control of the digital identity and the credential.

It’s worth noting that Open Badges v3 and Verifiable Credential issuing platforms could still have public web pages for badges. Olivia may not want all of her badges digitally signed but with Open Badges 3.0 and Verifiable Credentials, she has that choice.

This blog post is Part 2 of a series of content explaining W3C Verifiable Credentials and Open Badges 3.0. Stay tuned for Part 3 — “What does it mean that a badge is digitally signed?”.

For more information on the DCC, including membership, projects, and events, visit our website, wiki, and Linkedin. To receive our newsletter and invitations to webinars and events sign up for our community mailing list.

Explaining Verifiable Credentials and Open Badges 3.0 was originally published in Digital Credentials Consortium on Medium, where people are continuing the conversation by highlighting and responding to this story.

Sunday, 12. January 2025

The Tao of XDI

Is anybody out there?

Is blogging still a thing? If you’re reading this, please comment or like it so I know. I am, at heart, still the XDI Guy. My passion for robust, secure, and scalable distributed data management hasn’t waned. Building one of the first XDI implementations (shout-out to Markus Sabadello, who might have built the only other one), I learned a lot about the challenges and opportunities of distribu

Is blogging still a thing?

If you’re reading this, please comment or like it so I know.


I am, at heart, still the XDI Guy. My passion for robust, secure, and scalable distributed data management hasn’t waned. Building one of the first XDI implementations (shout-out to Markus Sabadello, who might have built the only other one), I learned a lot about the challenges and opportunities of distributed data at scale. Over the years, I’ve reflected on qualities essential for data ecosystems, qualities that are often overshadowed by content-driven ecosystems. For example:

• Caching semantics: Apps need governance and management of caches to respect data ownership while maintaining local operational efficiency.

• Transactionality: Mature data protocols depend on it for consistency and reliability.

• Request batching: Optimizing network requests is vital for performance and scalability.


After years away, I’m ready to dive back in. There are a couple of apps I want to build, and I want to build them on a distributed data platform. My first idea is a fully distributed implementation of Brad deGraf’s BooksWeLike concept—a way to share and discover books that resonate with you. (Brad, if you’re reading this and don’t like the idea, let me know!)


To make this happen, I’ve started scanning the distributed protocol space to see what’s out there. Here’s my initial list of protocols to evaluate:

• AT Protocol

• Nostr

• Solid

• ActivityPub

• Matrix

• IPFS (InterPlanetary File System)

• SSB (Secure Scuttlebutt)

• DID (Decentralized Identifiers)

• Libp2p

• Hypercore

• Waku

• Zero-Knowledge Identity Protocols (ZK Protocols)


What am I missing?

Are there protocols on this list that don’t belong? If so, why? Are there others I should consider? I haven’t started my evaluations yet, so I’m open to your insights. If you’ve built apps on these protocols or have opinions about them, I’d love to hear from you.


I’ll be capturing my journey of discovery here—sharing what I learn, where I stumble, and how I (hopefully) succeed. Let’s make distributed data a reality, by and for the people.


IdM Laboratory

Google has published a paper on zero-knowledge proofs compatible with ECDSA

Hello, this is Fujie. As I touched on in the post about the AAMVA mobile driver's license guidelines, zero-knowledge proofs will become important for addressing the linkability of mdoc and SD-JWT. At the end of the year, Google researchers published a paper titled "Anonymous credentials from ECDSA". https://eprint.iacr.org/2024/2010 Generating illustrations with AI leads to all sorts of odd results, which is amusing. Picking out the key points from the abstract: BBS+ imposed strict requirements on cryptographic suite support, so the authors made the scheme work with ECDSA so that legacy infrastructure can be supported. Part of the difficulty arises because schemes in the literature,

Hello, this is Fujie.

As I touched on in the post about the AAMVA mobile driver's license guidelines, zero-knowledge proofs will become important for addressing the linkability of mdoc and SD-JWT credentials.

At the end of last year, researchers at Google published a paper titled:

Anonymous credentials from ECDSA

https://eprint.iacr.org/2024/2010

Generating illustrations with AI leads to all sorts of odd results, which is amusing.

To pick out the key points from the abstract: existing schemes like BBS+ imposed strict requirements in terms of cryptographic suite support, so the authors made the approach work with ECDSA as well, so that legacy infrastructure can be supported.

Part of the difficulty arises because schemes in the literature, such as BBS+, use new cryptographic assumptions that require system-wide changes to existing issuer infrastructure.  In addition,  issuers often require digital identity credentials to be *device-bound* by incorporating the device’s secure element into the presentation flow.  As a result, schemes like BBS+ require updates to the hardware secure elements and OS on every user's device.

In this paper, we propose a new anonymous credential scheme for the popular and legacy-deployed Elliptic Curve Digital Signature Algorithm (ECDSA) signature scheme.  By adding efficient zk arguments for statements about SHA256 and document parsing for ISO-standardized identity formats, our anonymous credential scheme is that first one that can be deployed *without* changing any issuer processes, *without* requiring changes to mobile devices, and *without* requiring non-standard cryptographic assumptions.

This looks quite promising. Regarding generation speed, the paper also says:

Our proofs for ECDSA can be generated in 60ms.  When incorporated into a fully standardized identity protocol such as the ISO MDOC standard, we can generate a zero-knowledge proof for the MDOC presentation flow in 1.2 seconds on mobile devices depending on the credential size. These advantages make our scheme a promising candidate for privacy-preserving digital identity applications.

Generating a zero-knowledge proof for an mdoc presentation flow in 1.2 seconds: at that speed, this looks practical.

The full text of the paper is available as a PDF, so I plan to work through it over time.
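
As a reference point for what the paper is proving statements about, here is a small Python sketch of a plain ECDSA P-256 signature over SHA-256 using the cryptography library; the payload is a placeholder and the zero-knowledge proof itself is not implemented here.

# Reference sketch only: a plain ECDSA P-256 signature over SHA-256, the kind of
# legacy-compatible signature the paper builds zero-knowledge arguments about.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
issuer_key = ec.generate_private_key(ec.SECP256R1())  # NIST P-256
credential_bytes = b"example mdoc credential data"    # placeholder payload
signature = issuer_key.sign(credential_bytes, ec.ECDSA(hashes.SHA256()))
# Standard verification reveals the signed data to the verifier; the paper's scheme
# instead lets the holder prove in zero knowledge that such a signature exists.
issuer_key.public_key().verify(signature, credential_bytes, ec.ECDSA(hashes.SHA256()))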

 

 


Saturday, 11. January 2025

Werdmüller on Medium

Building an open web that protects us from harm

It’s not enough to be neutral. We must be allies. Continue reading on Medium »

It’s not enough to be neutral. We must be allies.

Continue reading on Medium »


Michael Ruminer

AI RAG with LlamaIndex, Local Embedding, and Ollama Llama 3.1 8b

In this post, I cover using LlamaIndex LlamaParse in auto mode to parse a PDF page containing a table, using a Hugging Face local embedding model, and using local Llama 3.1 8b via Ollama to perform naive Retrieval Augmented Generation (RAG). That’s a mouthful. I won’t go into how to setup Ollama and Llama 3.1 8b; this post assumes it is running. First off, you can find the code for this in m

In this post, I cover using LlamaIndex LlamaParse in auto mode to parse a PDF page containing a table, using a Hugging Face local embedding model, and using local Llama 3.1 8b via Ollama to perform naive Retrieval Augmented Generation (RAG). That’s a mouthful. I won’t go into how to set up Ollama and Llama 3.1 8b; this post assumes it is running.

First off, you can find the code for this in my LlamaIndex_Test GitHub repo under the Test1/src folder. At the time of this writing there is a Test0 and a Test1. To see the post about the Test0 code, see Using LlamaIndex — Part 1 OpenAI.

The code uses a .env and load_dotenv() to populate the needed LLAMA_CLOUD_API_KEY. I recommend that if you have an OPENAI_API_KEY entry in the .env that you comment it out for this experiment to prove to yourself that the embedding and LLM are local and not OpenAI. See the part 1 post for more details on the LLAMA_CLOUD_API_KEY.

#OPENAI_API_KEY=YOUR_API_KEY
LLAMA_CLOUD_API_KEY=YOUR_API_KEY

I put the pip install dependencies as comments at the top of the Python file. There is also a requirements.txt for the project as a whole that covers the package requirements for all the “Test” experiments.

# pip install llama-index-embeddings-huggingface
# pip install llama-index-llms-ollama
# pip install llama-index-core llama-parse llama-index-readers-file
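
For completeness, these are the imports the snippets below assume; the exact module paths can shift between llama-index releases, so treat this as a best-effort sketch rather than the author's exact file.

# Imports assumed by the snippets below (module paths may vary by llama-index version)
from dotenv import load_dotenv
from llama_parse import LlamaParse
from llama_index.core import VectorStoreIndex, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
load_dotenv()  # populates LLAMA_CLOUD_API_KEY from the .env file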

The nice thing about LlamaIndex LlamaParse is that it provides an auto mode that will use premium mode when specified criteria are met. In this experiment, I have set auto mode on, with triggers for a mode change on in-page images or tables. Also, to save on parsing credit usage in LlamaParse and because, for this example, it is all that is needed, I have set the pages to be parsed to PDF page 9 only (note that PDF page 9 is target page 8 to LlamaParse because it uses a 0-based page index). Like the part 1 post, I am using markdown output because it provides greater context to the LLM; though I did try it with result_type=text and received the proper query response despite the answer to the query being in a table.

# set LlamaParse for markdown output and auto_mode only parsing page 8
parser = LlamaParse(
    result_type="markdown",
    auto_mode=True,
    auto_mode_trigger_on_image_in_page=True,
    auto_mode_trigger_on_table_in_page=True,
    target_pages="8",
    verbose=True
)

So that you don’t have to open the PDF document that gets parsed to understand the input, below is a screenshot of the page.

As in part 1, I use LlamaParse.load_data to read the page and parse it. Since the page has a table and we are in auto mode, it will automatically use premium mode to potentially better handle the page and table. This will cause the page parse to cost 15 credits on LlamaIndex. Note that LlamaIndex will cache your parsed page for 48 hours unless you specify otherwise or change the parse parameters, which allows you to run the code more than once and incur the credit cost only once. I did try using the default "accurate" mode by removing the auto_mode parameters on the LlamaParse call, and it still parsed the table properly and returned the proper answer to the query — but this is a sample for showing the use of "auto mode" so just pretend that is not the case.

If you want to see the output of the parser, uncomment the print command after the documents variable is populated. I like to then paste it into a markdown viewer to see it as rendered markdown output. See the below image for that output.

with open(f"../../sample_docs/{file_name}", "rb") as file_to_parse:
    # LlamaParse will cache a parsed document for 48 hours if the parse parameters are not changed,
    # thus not incurring additional parse cost if you run this multiple times for testing purposes;
    # see the history tab in the LlamaParse dashboard for the project to confirm that
    # credits used = 0 for subsequent runs
    #
    # must provide extra_info with file_name key when passing a file object
    documents = parser.load_data(file_to_parse, extra_info=extra_info)
    # to manually check the output uncomment the below
    #print(documents[0].text)

I like to set the defaults for the LLM and embedding model so that I don’t need to pass them around as parameters. Here is where I set the embedding model to a Hugging Face provided model. When you run the Python for the first time it will pull down the embedding model automatically — nice!

# set the default embeddings and llm so that it doesn't have to be passed around
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.llm = Ollama(model="llama3.1:latest", request_timeout=120.0)

The next part of the code does the same as it did in Part 1, except that this time the VectorStoreIndex and the query engine use the models I set in the Settings singleton versus the LlamaIndex default of OpenAI.

# index the parsed documents using the default embedding model
index = VectorStoreIndex.from_documents(documents)

# generate a query engine for the index using the default llm
query_engine = index.as_query_engine()

# provide the query and output the results
query = "What is the latency in seconds for Nova Micro?"
response = query_engine.query(query)
print(response)

If all goes well you should get the response output as 0.5 and if you look back at the table from the page you’ll see that is correct.

(.venv) PS C:\python\LlamaIndex_Test\Test1\src> python parse_ollama.py
Started parsing the file under job_id 37dce328-aaa7-499b-afe9-498c32b63944
.0.5

To validate that the value was coming from the RAG-provided PDF page and not the LLM’s inherent “knowledge”, I asked a similar question via the command line to Ollama without providing the RAG context — output below:

PS C:\temp> ollama run llama3.1:latest "what is the latency in seconds for Nova Micro Amazon LLM model?"
I don't have access to specific information about the latency of the Nova Micro Amazon LLM (Large Language Model)
model. The details regarding models like this, especially concerning their performance metrics such as latency,
are typically available from the developers or through official documentation and may be subject to change. If
you're looking for accurate and up-to-date information on this topic, I recommend checking directly with Nova
Micro's resources or contacting them for the most current data.

There you have it. But I am not done quite yet in reporting my results. In LlamaIndex’s examples, they used this PDF but used PDF page 1 which contains an image. See below an image of the page.

They use this page to demonstrate how LlamaParse in auto mode moves into premium mode for the page parsing because of the image and then creates a mermaid diagram from the image because it recognizes the image is of a diagram. Below is what they report as the outcome in part.

# The Amazon Nova Family of Models:
# Technical Report and Model Card

Amazon Artificial General Intelligence

```mermaid
graph TD
A[Text] --> B[Nova Lite]
C[Image] --> B
D[Video] --> E[Nova Pro]
F[Code] --> E
G[Docs] --> E
B --> H[Text]
B --> I[Code]
E --> H
E --> I
J[Text] --> K[Nova Micro]
L[Code] --> K
K --> M[Text]
K --> N[Code]
O[Text] --> P[Nova Canvas]
Q[Image] --> P
P --> R[Image]
S[Text] --> T[Nova Reel]
U[Image] --> T
T --> V[Video]

style B fill:#f9f,stroke:#333,stroke-width:2px
style E fill:#f9f,stroke:#333,stroke-width:2px
style K fill:#f9f,stroke:#333,stroke-width:2px
style P fill:#f9f,stroke:#333,stroke-width:2px
style T fill:#f9f,stroke:#333,stroke-width:2px

classDef input fill:#lightblue,stroke:#333,stroke-width:1px;
class A,C,D,F,G,J,L,O,Q,S,U input;

classDef output fill:#lightgreen,stroke:#333,stroke-width:1px;
class H,I,M,N,R,V output;
```

Figure 1: The Amazon Nova family of models

When I tried this I did not get the same outcome from the parse. It did not even attempt to generate a mermaid diagram. I received the following output for the diagram image section; far from their professed output.

The Amazon Nova Family of Models:
Technical Report and Model Card
Amazon Artificial General Intelligence
Nova
Lite Nova
Nova Micro Ix
Pro <l> <l > </>
A Ix
</>
=
Nova Nova
Canvas Reel
Figure 1: The Amazon Nova family of models

In the experiment, everything except the LlamaParse cloud parsing runs locally, which is nice. I hope that this example is of use to you.

Friday, 10. January 2025

Wrench in the Gears

How Gregg Braden Led Me To Share My Thoughts on Quantum Coherence, Collective Computing, and Tokenized Spirituality

This is a video I posted yesterday. And a follow up to clarify certain points. Sorry, I guess the youtube AI choose the “hands up” screenshot for the preview image. I do talk a lot with my hands. Here is a link to Joseph Gonzalez’s blog post on emergent information communication systems. Here are two [...]

This is a video I posted yesterday.

And a follow up to clarify certain points. Sorry, I guess the YouTube AI chose the “hands up” screenshot for the preview image. I do talk a lot with my hands.

Here is a link to Joseph Gonzalez’s blog post on emergent information communication systems.

https://bantamjoe.com/2025/01/08/emergent-intelligent-communication-system/

Here are two related links (here and here) in my blog about web3 tokens as a form of symbolic language.

https://wrenchinthegears.com/2023/02/18/gods-eye-view-part-4-the-language-of-heartless-charity/
https://wrenchinthegears.com/2023/10/01/emergent-thoughts-on-web3-game-mechanics-and-tokens-as-a-computational-language-triggered-by-the-film-arrival/
https://wrenchinthegears.com/wp-content/uploads/2025/01/Invitation-to-Ontopoetics-Freya-Mathews.pdf

Below are several maps related to topics discussed for reference / exploration.

J.G. Bennett’s “lifelong learning communities” based on Gurdjieff’s principles in Surrey, Gloucestershire, and West Virginia.

https://embed.kumu.io/55da1b35ad86a647f722575c3f6a5c88#untitled-map?s=bm9kZS1FcnlES2JIYQ%3D%3D

Peter Drucker, Claremont, management theorist, and megachurch promoter with Texas cable TV mogul Bob Buford.

https://embed.kumu.io/840621d7eeadc3a92eb1b1f58848f2d6#untitled-map?s=bm9kZS1DaXRiWmxsMg%3D%3D

Jay Daniels, emeritus professor of religion and advocate of process philosophy from Hendrix College in Conway, Arkansas not far from Petit Jean State Park and Winthrop Rockefeller’s ranch.

https://embed.kumu.io/d0ce16570c05d58b2dcf14474e7c8482#untitled-map?s=bm9kZS1rU2tSTHc3VQ%3D%3D

Macy Conferences and Tavistock in relation to research into the psychology of group dynamics, cybernetics, and cognition.

https://embed.kumu.io/80de775f15b564883921a5ec7daf869b

Tokenized creativity and social prescribing in the context of “pay for success” wellness metrics and public “health.”

https://embed.kumu.io/ed4cd18205e44f10f71a103c7dafb250#untitled-map?s=bm9kZS1idlRxbk81cQ%3D%3D

Parallel community currencies and socio-technical systems.

https://embed.kumu.io/044479643a1b94c072b57dbd21ce8e7f

Tokenized emotional cartography linked to crowd-sourced computation.

https://embed.kumu.io/c93cb83f097f7f28ff88ade75d965919

 

Tokenized prediction markets in public policy linked to distributed cognition / swarm intelligence.

https://embed.kumu.io/789febca28a348cfa3cde24f44d2571e#untitled-map?s=bm9kZS1YNzJkZkgwUQ%3D%3D

Rise of second order cybernetics applied to the tokenized evolution of social systems under web3 spatial computing.

https://embed.kumu.io/5783ad4a7b17672e7cf8959fd5339231#untitled-map?s=bm9kZS1uUTE0MTd4cA%3D%3D

Aldous Huxley’s ties to psychedelics, human potential, mysticism, and the formation of AA.

https://embed.kumu.io/c4ecf9784bf71a6a64050c6c2d7be94b#untitled-map?s=bm9kZS10bEZNbEQ3Ng%3D%3D

Lots to chew on. Have at it!

 

 

 

Thursday, 09. January 2025

Phil Windleys Technometria

Authorization Matters

In 2013, poor authorization practices led to one of the largest data breaches in history.

In 2013, poor authorization practices led to one of the largest data breaches in history. Over a three-week period, hackers stole 40 million credit and debit card accounts from retail giant Target by exploiting the account of an HVAC contractor. The attack began with an email-based phishing scam that tricked an employee at the HVAC company into revealing their credentials. As a vendor to Target, the HVAC company had legitimate access to Target’s systems. Using the stolen credentials, the hackers infiltrated Target’s network and installed malware on the retailer’s point-of-sale systems, enabling them to collect customer data. The breach resulted in direct costs to Target of nearly $300 million, along with reputational damage that is difficult to quantify.1

The Target breach underscores the critical importance of not only knowing who is accessing your systems (authentication) but also controlling what they can access (authorization). The HVAC vendor had access to Target’s systems for electronic billing, contract submission, and project management. However, the hackers were able to extend this access beyond those legitimate uses to reach systems used to update point-of-sale devices. Target’s failure to properly restrict access to critical systems created a significant vulnerability that the hackers exploited.

But I don’t want you to get the idea that authorization is just about securing IT systems—it’s also fundamental to how modern cloud applications function, enabling features that wouldn’t be possible otherwise. For example, services like Google Docs and Dropbox rely on authorization to allow users to share documents with specific people while restricting access to others. Without this capability, these applications would not be possible.

Another example is Amazon Web Services (AWS), where authorization enables Amazon to securely provide hundreds of cloud services to millions of customers at scale in a cost-effective manner. As a global distributed system taking advantage of multi-tenancy to serve millions of customers, AWS uses a sophisticated policy-based authorization system to help customers control access to the services running inside their accounts.
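
As a rough illustration of what policy-based authorization means in practice, here is a minimal Python sketch under assumed, simplified semantics; the policy fields, principals, and wildcard matching below are hypothetical and far cruder than a real engine like AWS IAM.

# A toy sketch of policy-based authorization -- not AWS IAM, just the general shape:
# decisions are driven by declarative policies evaluated against each request.
from dataclasses import dataclass

@dataclass
class Policy:
    effect: str       # "allow" or "deny"
    principal: str    # who the policy applies to, e.g. "vendor:hvac-co"
    action: str       # e.g. "billing:SubmitInvoice"
    resource: str     # e.g. "billing/*" (trailing * is a prefix wildcard)

def is_allowed(policies, principal, action, resource):
    """Deny wins over allow; no matching allow means the request is denied."""
    def matches(p):
        resource_ok = (p.resource == resource or
                       (p.resource.endswith("*") and resource.startswith(p.resource[:-1])))
        return p.principal == principal and p.action == action and resource_ok
    matched = [p for p in policies if matches(p)]
    if any(p.effect == "deny" for p in matched):
        return False
    return any(p.effect == "allow" for p in matched)

policies = [Policy("allow", "vendor:hvac-co", "billing:SubmitInvoice", "billing/*")]
print(is_allowed(policies, "vendor:hvac-co", "billing:SubmitInvoice", "billing/invoices"))  # True
print(is_allowed(policies, "vendor:hvac-co", "pos:UpdateDevice", "pos/terminals"))          # False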

Several trends make authorization a more important topic than it has been:

More and more software is being delivered as a service (so-called Software as a Service, or SaaS). As we’ve discussed, cloud-based services can’t provide the needed levels of functionality and security without strong authorization systems.

Perimeter-based security architectures like firewalls are giving way to zero-trust architectures. Rather than assuming everyone inside the perimeter is trusted, zero-trust systems assume breach and authorize every access. This isn’t possible without flexible authorization systems that can determine whether to grant access based on the context of the request.

Internet of Things (IoT) and interconnected devices mean that more and more computer systems are talking to each other on their owner’s behalf. Restricting their access to the role they play protects them and prevents them from becoming a vector that hackers can exploit to break into other sensitive systems.

Regulatory compliance is yet another reason why authorization is increasingly important. Laws like the Health Insurance Portability and Accountability Act (HIPAA) and General Data Protection Regulation (GDPR) require that access to personal information be carefully controlled so that only the right people and systems can use it. Beyond controlling access, these laws also require that you be able to audit access to show compliance.

In addition to these, the rise of GenAI and particularly AI agents that can act on your behalf pour gasoline on the smoldering identity industry interest in authorization. Am I talking to a human? Does this agent have authority to make a doctor's appointment on the patient's behalf? These and other questions mean that making good authorization decisions, knowing what decisions were made, and easily incorporating authorization infrastructure into your apps and infrastructure is vital.

Notes

The story is interesting beyond the fact that it highlights the need to properly manage access. Target’s response, or rather lack of one, provides several important security lessons. Cyber Case Study: Target Data Breach provides more details.

Photo Credit: Hacker Stealing Credit Cards from DALL-E (public domain). Prompt: Draw a wide picture of a hacker stealing credit cards from an online store.


The Pragmatic Engineer

The Pulse #119: Are LLMs making StackOverflow irrelevant?

Also: Apple fires staff over fake charities scam, AI models just keep improving, a middle manager burnout possibly on the horizon, and more.

The Pulse is a series covering insights, patterns, and trends within Big Tech and startups. Notice an interesting event or trend? Send me a message.

Today, we cover:

Industry pulse. Fake GitHub stars on the rise, Anthropic to raise at $60B valuation, JP Morgan mandating 5-day RTO while Amazon struggles to find enough space for the same, Devin less productive than at first glance, and more.

Are LLMs making StackOverflow irrelevant? Fresh data shows that the number of questions asked on StackOverflow is as low as it was back in 2009 – when StackOverflow was one year old. The drop suggests that ChatGPT – and LLMs – managed to make StackOverflow’s business model irrelevant in about two years’ time.

Apple fires staff over fake charities scam. In order to get around $4,000 per year in additional tax cuts, six Apple employees tried to defraud Apple – and the IRS. They were caught, fired, and now face prosecution. A reminder that getting “clever” with corporate perks can wreck otherwise lucrative careers at Big Tech.

AI models just keep improving rapidly. Two months after wondering whether LLMs have hit a plateau, the answer seems to be a definite “no.” Google’s Gemini 2.0 LLM and Veo 2 video model are impressive, OpenAI previewed a capable o3 model, and Chinese startup DeepSeek unveiled a frontier model that cost less than $6M to train from scratch.

Middle manager burnout incoming? A Forbes article suggests a broader middle manager burnout to come across most professional sectors. This could simply be a consequence of higher interest rates, teams growing less, and more pressure on managers. It’s tougher to be an engineering manager than it was during the 2010-2022 period, that’s for sure.

1. Industry pulse

Fake GitHub stars

For years, GitHub stars have been used as a proxy by VC investors to gauge how much traction an open source project has. Projects with high traction were much more likely to attract investment because investors assumed that developers’ interest could eventually be monetized.

So, if an open source project could increase its chance of attracting funding by getting more stars, what do you think happened? According to recent research from Carnegie Mellon University, security platform Socket, and North Carolina State University, it’s exactly what you’d expect: projects are faking their GitHub stars.

The counterfeiting of GitHub stars seems to have exploded ~100x since just last year. Here’s data from the paper:

Source: 4.5 Million Suspected Fake Stars in GitHub

The report finds fake stars being used to promote malware repositories, video game cheats, and crypto bots. Some are likely used for growth hacking to secure investment, while some are deployed for “resume fraud:” making it seem a software engineer’s side project on GitHub is a lot more popular than it actually is!

This research is a reminder that GitHub stars can be easily purchased, and more repos are doing just this. GitHub does its part to make it harder to create and operate accounts to buy/sell stars: it has Trust & Safety and Platform Health groups that battle account spam and account farming and are known to suspend accounts that abuse its terms and conditions. Still, this is a cat-and-mouse game, and the ability to buy stars to boost perceived repo popularity is not going anywhere. The more GitHub cracks down on this, the more expensive purchasing those additional stars will likely become, though.

Update on 12 Jan 2025: updated the last sentence that previously stated that GitHub has no incentive to crack down on purchased stars — based on more details I received on what GitHub is doing to fight against such bad actors.

Anthropic to raise at massive $60B valuation?

Read more


Just a Theory

Sqitch 1.5.0

Sqitch v1.5.0: out now in all the usual places!

Released yesterday: Sqitch v1.5.0. This version replaces the MySQL driver DBD::mysql with DBD::MariaDB, both for its better backward compatibility with MySQL and MariaDB driver libraries and for its improved Unicode handling. The Docker image likewise switched to the MariaDB mysql client. I expect no compatibility issues, but you never know! Please file an issue should you find any.

V1.5.0 also features fixes for Yugabyte deployment, Oracle error handling, existing Snowflake schemas, connecting to MySQL/MariaDB without a database name, and omitting the checkit MySQL/MariaDB function when the Sqitch user lacks sufficient permission to create it. Sqitch now will also complain when deploying with --log-only and a deployment file is missing.

Find it in the usual places:

sqitch.org GitHub CPAN Docker Homebrew

Many thanks to everyone who has enjoyed using Sqitch and let me know in person, via email, Mastodon, bug reports, and patches. It gratifies me how useful people find it.

More about… Sqitch MariaDB MySQL Yugabyte Oracle Snowflake

Wednesday, 08. January 2025

The Pragmatic Engineer

Confessions of a Big Tech recruiter

In today’s episode of The Pragmatic Engineer, I’m joined by Blake Stockman, former tech recruiter at Google, Meta, Uber, and YC.
Stream the Latest Episode

Available now on Spotify, YouTube and Apple. See the episode transcript at the top of this page.

Brought to You By

DX — DX is an engineering intelligence platform designed by leading researchers.

Vanta — Automate compliance and simplify security with Vanta.

In This Episode

In today’s episode of The Pragmatic Engineer, I catch up with one of the best tech recruiters I’ve had the opportunity to work with: Blake Stockman, a former colleague of mine from Uber. Blake built a strong reputation in the recruiting world, working at tech giants like Google, Meta, Uber, and Flexport. He also spent time with Y Combinator and founded his own agency, where he helped both large tech companies and early-stage startups find and secure top talent. A few months ago, Blake made a career pivot: he is now studying to become a lawyer. I pounced on this perfect opportunity to have him share all that he’s seen behind the scenes in tech recruitment, with his observations unfiltered.

In our conversation, Blake shares recruitment insights from his time at Facebook, Google, and Uber and his experience running his own tech recruitment agency. We discuss topics such as:

• A step-by-step breakdown of hiring processes at Big Tech and startups
• How to get the most out of your tech recruiter, as a candidate
• Best practices for hiring managers to work with their recruiter
• Why you shouldn’t disclose salary expectations upfront, plus tips for negotiating
• Where to find the best startup opportunities and how to evaluate them—including understanding startup compensation
• And much more!

Takeaways

My biggest takeaways from this practical conversation:

1. Almost every offer has room for negotiation. So if you get extended an offer, consider negotiating at least a little! In Blake’s words:

“Try to understand what it is that the recruiter is working with. They're going to ask you to state what your compensation expectations are. Every candidate is best suited by responding respectfully: “I would like to see an offer that the company thinks constitutes the value that I have to bring to the table.” And sticking with this.

If any company is not willing to give you an offer because you didn't state your compensation expectations upfront — which I've never ever seen! — they're probably not the right company for you to work at. Ask them to put you a fair and honest offer on the table. “

2. GenAI is already changing recruitment. Here is how Blake describes the change he sees:

“Just like software engineers are realizing you can do some amazing things very quickly with AI: the same is true for recruitment.

Everyone on the job market has gotten messages from a recruiter where it feels like a pretty generic template. Something along the lines of “here’s some basic information, here’s the opportunity, please respond if you’d like.” The recruiter sends the same template to all candidates in an area with the “senior software engineer” title. I expect all of this to be automated by AI — at least for the basic things. A lot of the repetition, the generic work that recruiters do: a lot of it could be automated.

And then we see AI playing a more focused role in note taking, synthesizing information, consolidating notes across multiple interviewers.

Recruitment is becoming more automated and less personal, up until much later stages in the process. This is a generic trend, but AI is speeding it up.”

We covered more on how GenAI is impacting tech hiring in the deepdive How GenAI is reshaping tech hiring.

3. The larger a company, the more decentralized hiring decisions are. Startups and small companies typically have hiring managers — or founders! — make the hiring decision. However, larger tech companies often have hiring committees deciding whether an offer should be extended.

This is one reason smaller companies can move a lot faster in hiring. It also makes it trickier to figure out why exactly a larger company might have said “no” after an onsite round that felt went well: the input for the decision could have come from people not even in the loop!

4. A tip for engineering leaders ahead of a hiring drive: create a ‘hiring talent map!’ This is how Blake describes this:

“For example, when we were hiring for the Payments team: we decided that we needed to find a couple of candidates working at companies where they have relevant Payments experience, or who could ramp up quickly. We needed people who had an understanding of our domain knowledge. So we mapped out these companies.

We then looked at this map, and asked: “okay, but which ones do we think also have like really good engineering talent that also, you know, would plant in really well with a broader engineering culture here?”

Based on these, we put together a more structured outreach plan. This helped us hire more focused, and faster.”

For more advice on preparing for tech interviews: take a look at a video I did a few years back: Confessions from a Big Tech Hiring Manager: Tips for Software Engineering Interviews

The Pragmatic Engineer deepdives relevant for this episode

How GenAI is reshaping tech hiring

Hiring software engineers

Hiring an Engineering Manager

Hiring Junior Software Engineers

Timestamps

(00:00) Intro

(01:40) Tips for working with recruiters

(06:11) Why hiring managers should have more conversations with recruiters

(09:48) A behind-the-scenes look at the hiring process at big tech companies

(13:38) How hiring worked at Uber when Gergely and Blake were there

(16:46) An explanation of calibration in the recruitment process

(18:11) A case for partnering with recruitment

(20:49) The different approaches to recruitment Blake experienced at different organizations

(25:30) How hiring decisions are made

(31:34) The differences between hiring at startups vs. large, established companies

(33:21) Reasons desperate decisions are made and problems that may arise

(36:30) The problem of hiring solely to fill a seat

(38:55) The process of the closing call

(40:24) The importance of understanding equity

(43:27) Tips for negotiating

(48:38) How to find the best startup opportunities, and how to evaluate if it’s a good fit

(53:58) What to include on your LinkedIn profile

(55:48) A story from Uber and why you should remember to thank your recruiter

(1:00:09) Rapid fire round

Resources & Mentions

Where to find Blake Stockman:

• LinkedIn: https://www.linkedin.com/in/blake-stockman/

Mentions during the episode:

• The Software Engineer's Guidebook: Audiobook: https://gergelyorosz.gumroad.com/l/dlzxi

• The Staff Engineer's Path: A Guide for Individual Contributors Navigating Growth and Change: https://www.amazon.com/Staff-Engineers-Path-Individual-Contributors/dp/1098118731/

• Flexport: https://www.flexport.com/

• YCs Work at a Startup site: https://www.workatastartup.com/

• Andreessen Horowitz’s portfolio page: https://a16z.com/portfolio/

• The Midas List: The World's Best Venture Capital Investors In 2024: https://www.forbes.com/lists/midas/

• The Three-Body Problem: https://www.amazon.com/Three-Body-Problem-Cixin-Liu/dp/0765382032

• State of the software engineering job market in 2024: https://newsletter.pragmaticengineer.com/p/state-of-eng-market-2024

• Hiring software engineers and engineering leaders from Big Tech (Part 1): https://newsletter.pragmaticengineer.com/p/hiring-from-big-tech

• Hiring software engineers and engineering leaders from Big Tech (Part 2): https://newsletter.pragmaticengineer.com/p/hiring-from-big-tech-part-2

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.


Heres Tom with the Weather

Hockey Trivia

As I just registered for a new season of beer league hockey starting this weekend, I thought about an episode of C’est comme ça que je t’aime I recently watched which is set in 1974 Sainte-Foy, Quebec City. Nearly all the 1974 Quebec cultural references are new to me and in this episode, there is an adventure in a hockey arena and you can pause the video in a few places and see signage in the

As I just registered for a new season of beer league hockey starting this weekend, I thought about an episode of C’est comme ça que je t’aime I recently watched which is set in 1974 Sainte-Foy, Quebec City. Nearly all the 1974 Quebec cultural references are new to me and in this episode, there is an adventure in a hockey arena and you can pause the video in a few places and see signage in the background for a junior hockey team called Ste. Foy Couillard that only existed from 1972-1974.

You can watch it at ICI Tou.tv (in Canada) or Amazon Prime (in U.S.).


Moxy Tongue

OYO AI

 With over 1000 AI communities deployed in 2024, the next AI cohort begins.. Coding, Computer Science, Artificial Intelligence, Entrepreneurial Skill Development, Teacher PD,  With mentor support of all learners, students and teachers. Unlimited learning by design: (own root)  OYO®  AI  by kidOYO®  at OYOclass.com, Educati

 With over 1000 AI communities deployed in 2024, the next AI cohort begins..












Coding, Computer Science, Artificial Intelligence, Entrepreneurial Skill Development, Teacher PD, 

With mentor support of all learners, students and teachers. Unlimited learning by design: (own root) 

OYO®  AI  by kidOYO®  at OYOclass.com, Educational Software Services.

Tuesday, 07. January 2025

Werdmüller on Medium

46 books

Every book is a portal out of here. Continue reading on Medium »

Every book is a portal out of here.

Continue reading on Medium »


The Pragmatic Engineer

Bug management that works (Part 2)

Making time for bug fixing, dedicated ‘goalies’, preventing bugs, and how to create a quality-focused culture which minimizes them

How do you deal with bugs in the software products you build? This topic is under-discussed, even though almost all software engineers deal with software bugs and regressions every week.

We reached out to two dozen engineering leaders and software engineers to learn about current, cutting-edge approaches to bug management. In Part 1 of this mini-series, we covered:

Catching bugs

Reporting bugs

Bug triage

“Don’t manage bugs: fix them!”

Zero-bugs policy

In this second, concluding article on this topic, we look into what typically happens after the bug triage stage:

Create time and space for bug fixing. Bug expiration dates, weekly ‘bug pickups’, and bug fixing time budgets can all help.

Dedicated bug fixers: ‘goalies’. An engineer dedicated to bug fixing for a couple of weeks may be known as a “support engineer”, as “being on bug duty”, or as “first line”. It’s a popular approach with its own challenges.

Fixing bugs properly. A root cause analysis to fix underlying causes is a pragmatic approach. Several teams opt to treat high-priority bugs as incidents.

Prevention is better than cure. Automated tests like unit, integration, end-to-end, and performance tests, coupled with CI/CD is a common approach. So is investing in other bug prevention approaches.

A quality culture for fewer bugs. It takes effort from engineering leadership to create a culture that prioritizes quality. At companies with this focus, tactics for this include bug metrics, operational reviews, and engineers not asking permission to do bug fixing.

Related deep dives on this topic to check out:

Shipping to production

Incident review and postmortem best practices

Healthy oncall practices

How Big Tech does Quality Assurance (QA)

QA approaches across the industry

Thank you to everyone who contributed insights to this article:

Anaïs van Asselt (senior QA engineer), Andrea Sipos (product leader), Felix Hageloh (lead engineer), Gus Fune (CTO), Hugo Valante (tech lead), Ignatius Nothnagel (Director of Platform Engineering), Ivan Tchomgue (People manager/product owner), Jason Diller (VP of Engineering), Jayesh Varma (Lead Android engineer), Marin Dimitrov (Head of Engineering), Matteo Galli (CTO), Maya Ziv (senior software engineer), Owain Lewis (Director of Engineering), Piotr Zolnierek (CTO), Neil Oler (senior software engineer), Rebecca Frost (QA leader), Rebecca Holm Ring (engineering leader), Ruben Weijers (engineering manager), Ryan Hanni (Director of Engineering), Serdar Biyik (engineering manager), Walter de Bruijn (Head of Engineering Productivity)

The bottom of this article could be cut off in some email clients. Read the full article uninterrupted, online.

Read the full article online

1. Create time and space for bug fixing

Fixing bugs when they happen is the single best approach, but unfortunately not always realistic. The next best thing is to ensure there’s enough time for engineers to fix problematic issues. Below are some approaches.

Fix bugs on the go

Several engineering leaders at smaller companies say their preferred approach is to simply fix bugs as they occur:

“We do continuous bug fixing: always balancing value creation (creating new features) with value loss prevention (removing the friction caused by bugs in existing features).” – Marin Dimitrov, Head of Engineering at Manual

“We prioritize fixing bugs over working on other things. Most of our bugs are cleared as they come in. We found this is more productive than having ‘bug fixing weeks’.” – Gus Fune, CTO at Div Brands

This approach seems harder to do at growing or large companies, where some bugs need several teams to fix them, or it’s unclear who owns a bug.

Bug expiration dates

A clever approach is to set expiration dates for when a bug should be resolved. The closer this date gets, the higher its priority. Ruben Weijers, engineering manager at TomTom elaborates:

“All triaged bugs have an end-date and an owner. If a bug is past its end-date (meaning it ‘breaches’ this date), it becomes a release blocker, regardless of whether it's a low priority bug.”
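
A minimal sketch of how this rule could be automated, assuming a hypothetical bug record with an owner, a priority, and an end date; any bug past its end date is flagged as a release blocker regardless of its original priority.

# Sketch of the "bug expiration date" idea described above, under assumed fields.
from dataclasses import dataclass
from datetime import date

@dataclass
class Bug:
    title: str
    owner: str
    priority: str   # "low", "medium", or "high"
    end_date: date  # the date by which the bug must be resolved

def release_blockers(bugs, today=None):
    # Any bug past its end date becomes a release blocker, whatever its priority.
    today = today or date.today()
    return [b for b in bugs if today > b.end_date]

bugs = [
    Bug("Tooltip flickers", "alice", "low", date(2025, 1, 1)),
    Bug("Crash on login", "bob", "high", date(2025, 2, 1)),
]
for bug in release_blockers(bugs, today=date(2025, 1, 15)):
    print(f"Release blocker: {bug.title} (owner: {bug.owner})")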

Weekly ‘bug pickups’

Ryan Hanni, engineering director at Ontra:

“We have used a weekly bug pickup process. The way it worked was simple: pick up one bug per team, per week, and fix it! This helped our bug backlog stay reasonably sized. We would always pick a high priority bug until there were none left, then do this with Medium and Low priority ones.”

Time budgets

A common approach is to fix a percentage of devs’ time to be used for bug fixing on a sprint basis, weekly or monthly.

“At various companies we’ve used quotas, which refer to the minimum percentage of time invested in bug fixing and quality improvements for each sprint. SLO-like targets on the resolution time for a bug, based on its severity (critical / high / medium / low), may be a good ‘forcing function’ to help teams better balance building new features vs bugfixing and quality improvements: when the SLO targets are regularly exceeded, this may be a signal that the team needs to increase the time (quota) allocated to bugfixing until the balance can be restored again.” – Marin Dimitrov, Head of Engineering at Manual.

“We first add bugs onto our sprint, allocating around 10-15% of our velocity. We prioritize bugs reported from our Live/Production environment. This approach means that we balance delivering new features with fixing existing issues.” – Jayesh Varma, lead Android engineer at Barclays

Ignatius Nothnagel, director of platform engineering at LocalStack uses this approach, but advises against sprinting:

“I've seen two approaches actually work in the wild:

1. Dedicate a fixed, non-negotiable percentage of capacity during every sprint to bugs and improvements.

2. Not ‘sprinting.’ Drop the concept of sprints. In my experience, this works amazingly! It turns out that hiring responsible adults and holding them accountable for making the right trade-off decisions actually works.”

Bug sprints and bug days

Interestingly enough, the approach of ‘batching’ bug fixing into a few days or a week can be hit-and-miss: either it works well enough to be a regular thing, or teams drop it because the outcome disappoints.

Accounts of when it’s a hit:

“We do a quarterly ‘just do it day’ where all engineers get to work on whatever they want for a day. This usually ends up being quality of life (QOL) improvements, dev tooling, and refactoring/cleanup work. It’s everyone’s favorite holiday!” – Maya Ziv, senior software engineer at Pavilion

“Regular bug bashes and FixIt weeks have worked very well for teams I’ve worked with at Uber, Hopin, Craft, and now Manual” – Marin Dimitrov, Head of Engineering at Manual

…and when it’s a miss:

“Bug fixing, keeping the lights on (KTLO), and other emergent work outside of regular product strategy increments happens on Fridays. It doesn’t work because a lot of this work won’t fit in a single Friday, and leads to lots of context switching, dead end effort, and wasted time.” – a software engineer at a small health tech company

“We avoid ‘fix it weeks’ in favor of continuous, weekly bug pickups. If our backlog gets too big, we meet with cross-functional stakeholders (PM, UX, Dev, QE) to divide up the bugs across teams and have them fixed within the next two weeks or so, working the bugs into their cycle as they see fit.” – Ryan Hanni, director of engineering at Ontra

Another criticism of regular bug sprints is that they incentivize engineers to not worry about bugs day to day because they know there’s a regular event for dealing with them. This can reduce motivation to keep software tidy and bug-free at all times.

Warranty sprints

A variation of regular bug sprints is the ‘warranty sprint.’ These refer to spending a week or two on addressing incoming bugs and feedback about a freshly released feature. Figma similarly prioritizes bugs for newly released features, as covered in the “Fix all bugs for recently released features” section, but warranty sprints are more focused.

Jason Diller, VP of Engineering at Arteria AI, shares:

“For warranty sprints, we typically don’t shift a team off a project as soon as it ships. We expect and plan for feedback and bugs to be higher volume right after a delivery, and keep the team dedicated to addressing those for a sprint or two, rather than punting all of that to a backlog to be dealt with later.”

2. Dedicated bug fixers: ‘goalies’

At mid-sized and larger companies, a common approach to staying on top of bugs is for an engineer to focus only on bug-related work. This role goes by a couple of names: “goalie” is the most common one, as in a soccer goalkeeper. Other terms are “support engineer,” “being on bug duty”, “bug duty officer”, “first line”, and even “bug trooper”.

Read more

Sunday, 05. January 2025

The Pragmatic Engineer

How AI-assisted coding will change software engineering: hard truths

A field guide that also covers why we need to rethink our expectations, and what software engineering really is. A guest post by software engineer and engineering leader Addy Osmani

Hi, this is Gergely with a bonus issue of the Pragmatic Engineer Newsletter. In every issue, we cover topics related to Big Tech and startups through the lens of software engineers and engineering leaders. To get articles like this in your inbox, every week, subscribe:

Subscribe now

Happy New Year! As we look toward the innovations that 2025 might bring, it is a sure bet that GenAI will continue to change how we do software engineering.

It’s hard to believe that just over two years ago in November of 2022 was ChatGPT’s first release. This was the point when large language models (LLMs) started to get widespread adoption. Even though LLMs are built in a surprisingly simple way, they produce impressive results in a variety of areas. Writing code turns out to be perhaps one of their strongest points. This is not all that surprising, given how:

Programming involves far simpler grammar than any human language

There is a massive amount of high-quality training data for these LLMs to use, in the form of working source code, thanks to open source software and crawling GitHub and other free-to-access code repositories (this kind of crawling and training is happening, regardless of whether it is ethical or not)

Last year, we saw that about 75% of developers use some kind of AI tool for software engineering–related work, as per our AI tooling reality check survey. And yet, it feels like we’re still early in the tooling innovation cycle, and more complex approaches like AI software engineering agents are likely to be the center of innovation in 2025.

Mainstream media has been painting an increasingly dramatic picture of the software engineering industry. In March, Business Insider wrote about how “Software engineers are getting closer to finding out if AI really can make them jobless”, and in September, Forbes asked: “Are software engineers becoming obsolete?” While such articles get wide reach, they are coming from people who are not software engineers themselves, don’t use these AI tools, and are unaware of the efficiency (and limitations!) of these new GenAI coding tools.

But what can we realistically expect from GenAI tools for shaping software engineering? GenAI will change parts of software engineering, but it is unlikely to do so in the dramatic way that some previous headlines suggest. And with two years of using these tools, and with most engineering teams using them for 12 months or more, we can shape a better opinion of them.

Addy Osmani is a software engineer and engineering leader, in a good position to observe how GenAI tools are really shaping software engineering. He’s been working at Google for 12 years and is currently the Head of Chrome Developer Experience. Google is a company at the forefront of GenAI innovation. The company authored the research paper on the Transformers architecture in 2017 that serves as the foundation for LLMs. Today, Google has built one of the most advanced foundational models with Gemini 2.0 and is one of the biggest OpenAI competitors.

Addy summarized his observations and predictions in the article The 70% problem: Hard truths about AI-assisted coding. It’s a grounded take on the strengths and weaknesses of AI tooling, one that highlights fundamental limitations of these tools, as well as the positives that are too good to not adopt as an engineer. It also offers practical advice for software engineers from junior to senior on how to make the most out of these tools. With Addy’s permission, this is an edited version of his article, re-published, with more of my thoughts added at the end. This issue covers:

How developers are actually using AI. Very different usages for “bootstrappers” versus “iterators.” Perhaps a reason why one tool is unlikely to work equally well for both groups?

The 70% problem: AI's learning curve paradox. Lesser-talked-about challenges with AI: the “two steps back paradox,” the hidden cost of “AI speed,” and the “knowledge paradox.”

What actually works: practical patterns. AI-first draft, constant conversation, and “trust but verify” patterns.

What does this mean for developers? Start small, stay modular, and trust your experience.

The rise of agentic software engineering. A shift to collaborating with AI, multi-modal capabilities, autonomous but guided approaches, and an “English-first” development environment.

The return of software as a craft? The lost art of polish to return, and the renaissance of personal software.

Additional thoughts. A good time to refresh what software engineering really is and how it has been the dream of needing no developers since the 1960s. And still, demand for experienced engineers could well increase in the future, rather than decrease.

Addy’s name might ring familiar to many of you. In August, we published an excerpt from his new book, Leading Effective Teams. Addy also writes a newsletter called Elevate: subscribe to it to get Addy’s posts in your inbox.

With this, it’s over to Addy:

After spending the last few years embedded in AI-assisted development, I've noticed a fascinating pattern. While engineers report being dramatically more productive with AI, the actual software we use daily doesn’t seem like it’s getting noticeably better. What's going on here?

I think I know why, and the answer reveals some fundamental truths about software development that we need to reckon with. Let me share what I've learned.

I've observed two distinct patterns in how teams are leveraging AI for development. Let's call them the "bootstrappers" and the "iterators." Both are helping engineers (and even non-technical users) reduce the gap from idea to execution (or MVP).

1. How developers are actually using AI

The Bootstrappers: Zero to MVP

Tools like Bolt, v0, and screenshot-to-code AI are revolutionizing how we bootstrap new projects. These teams typically:

Start with a design or rough concept

Use AI to generate a complete initial codebase

Get a working prototype in hours or days instead of weeks

Focus on rapid validation and iteration

The results can be impressive. I recently watched a solo developer use Bolt to turn a Figma design into a working web app in next to no time. It wasn't production-ready, but it was good enough to get very initial user feedback.

The Iterators: daily development

The second camp uses tools like Cursor, Cline, Copilot, and WindSurf for their daily development workflow. This is less flashy but potentially more transformative. These developers are:

Using AI for code completion and suggestions

Leveraging AI for complex refactoring tasks

Generating tests and documentation

Using AI as a "pair programmer" for problem-solving

But here's the catch: while both approaches can dramatically accelerate development, they come with hidden costs that aren't immediately obvious.

2. The 70% problem: AI's learning curve paradox

A tweet that recently caught my eye perfectly captures what I've been observing in the field: Non-engineers using AI for coding find themselves hitting a frustrating wall. They can get 70% of the way there surprisingly quickly, but that final 30% becomes an exercise in diminishing returns.

Source: Peter Yang on X

This "70% problem" reveals something crucial about the current state of AI-assisted development. The initial progress feels magical: you can describe what you want, and AI tools like v0 or Bolt will generate a working prototype that looks impressive. But then reality sets in.

The two steps back pattern

What typically happens next follows a predictable pattern:

You try to fix a small bug

The AI suggests a change that seems reasonable

This fix breaks something else

You ask AI to fix the new issue

This creates two more problems

Rinse and repeat

This cycle is particularly painful for non-engineers because they lack the mental models to understand what's actually going wrong. When an experienced developer encounters a bug, they can reason about potential causes and solutions based on years of pattern recognition. Without this background, you're essentially playing whack-a-mole with code you don't fully understand.

The hidden cost of "AI Speed"

When you watch a senior engineer work with AI tools like Cursor or Copilot, it looks like magic. They can scaffold entire features in minutes, complete with tests and documentation. But watch carefully, and you'll notice something crucial: They're not just accepting what the AI suggests. They're constantly:

Refactoring the generated code into smaller, focused modules

Adding edge case handling the AI missed

Strengthening type definitions and interfaces

Questioning architectural decisions

Adding comprehensive error handling

In other words, they're applying years of hard-won engineering wisdom to shape and constrain the AI's output. The AI is accelerating implementation, but their expertise is what keeps the code maintainable.

Junior engineers often miss these crucial steps. They accept the AI's output more readily, leading to what I call "house of cards code" – it looks complete but collapses under real-world pressure.

A knowledge gap

The most successful non-engineers I've seen using AI coding tools take a hybrid approach:

Use AI for rapid prototyping

Take time to understand how the generated code works

Learn basic programming concepts alongside AI usage

Build up a foundation of knowledge gradually

Use AI as a learning tool, not just a code generator

But this requires patience and dedication, which is exactly the opposite of what many people hope to achieve by using AI tools in the first place.

The knowledge paradox

Here's the most counterintuitive thing I've discovered: AI tools help experienced developers more than beginners. This seems backward. Shouldn't AI democratize coding?

The reality is that AI is like having a very eager junior developer on your team. They can write code quickly, but they need constant supervision and correction. The more you know, the better you can guide them.

This creates what I call the "knowledge paradox":

Seniors use AI to accelerate what they already know how to do

Juniors try to use AI to learn what to do

The results differ dramatically

I've watched senior engineers use AI to:

Rapidly prototype ideas they already understand

Generate basic implementations they can then refine

Explore alternative approaches to known problems

Automate routine coding tasks

Meanwhile, juniors often:

Accept incorrect or outdated solutions

Miss critical security and performance considerations

Struggle to debug AI-generated code

Build fragile systems they don't fully understand

There's a deeper issue here: The very thing that makes AI coding tools accessible to non-engineers, their ability to handle complexity on your behalf, can actually impede learning. When code just "appears" without you understanding the underlying principles:

You don't develop debugging skills

You miss learning fundamental patterns

You can't reason about architectural decisions

You struggle to maintain and evolve the code

This creates a dependency where you need to keep going back to AI to fix issues, rather than developing the expertise to handle them yourself.

Implications for the future

This "70% problem" suggests that current AI coding tools are best viewed as:

Prototyping accelerators for experienced developers

Learning aids for those committed to understanding development

MVP generators for validating ideas quickly

But they're not yet the coding democratization solution many hoped for. The final 30%, the part that makes software production-ready, maintainable, and robust, still requires real engineering knowledge.

The good news? This gap will likely narrow as tools improve. But for now, the most pragmatic approach is to use AI to accelerate learning, not replace it entirely.

3. What actually works: practical patterns

After observing dozens of teams, here's what I've seen work consistently:

"AI first draft" pattern

Let AI generate a basic implementation

Manually review and refactor for modularity

Add comprehensive error handling

Write thorough tests

Document key decisions

"Constant conversation" pattern

Start new AI chats for each distinct task

Keep context focused and minimal

Review and commit changes frequently

Maintain tight feedback loops

"Trust but verify" pattern

Use AI for initial code generation

Manually review all critical paths

Conduct automated testing of edge cases

Implement regular security audits

4. What does this mean for developers?

Despite these challenges, I'm optimistic about AI's role in software development. The key is understanding what it's really good for:

Accelerating the known. AI excels at helping us implement patterns we already understand. It's like having an infinitely patient pair programmer who can type really fast.

Exploring the possible. AI is great for quickly prototyping ideas and exploring different approaches. It's like having a sandbox where we can rapidly test concepts.

Automating the routine. AI dramatically reduces the time spent on boilerplate and routine coding tasks, letting us focus on the interesting problems.

If you're just starting with AI-assisted development, here's my advice:

Start small

Use AI for isolated, well-defined tasks

Review every line of generated code

Build up to larger features gradually

Stay modular

Break everything into small, focused files

Maintain clear interfaces between components

Document your module boundaries

Trust your experience

Use AI to accelerate, not replace, your judgment

Question generated code that feels wrong

Maintain your engineering standards

5. The rise of agentic software engineering

The landscape of AI-assisted development is shifting dramatically as we head into 2025. While the current tools have already changed how we prototype and iterate, I believe we're on the cusp of an even more significant transformation: the rise of agentic software engineering.

What do I mean by "agentic"? Instead of just responding to prompts, these systems can plan, execute, and iterate on solutions with increasing autonomy.

If you’re interested in learning more about agents, including my take on Cursor/Cline/v0/Bolt, you may be interested in my recent JSNation talk.

We're already seeing early signs of this evolution:

From responders to collaborators

Current tools mostly wait for our commands. But look at newer features like Anthropic's computer use in Claude, or Cline's ability to automatically launch browsers and run tests. These aren't just glorified autocomplete. They're actually understanding tasks and taking the initiative to solve problems.

Think about debugging: Instead of just suggesting fixes, these agents can:

Proactively identify potential issues

Launch and run test suites

Inspect UI elements and capture screenshots

Propose and implement fixes

Validate the solutions work (this could be a big deal)

The Multimodal future

The next generation of tools may do more than just work with code. They could seamlessly integrate:

Visual understanding (UI screenshots, mockups, diagrams)

Natural language conversations

Environment interaction (browsers, terminals, APIs)

This multimodal capability means they can understand and work with software the way humans do: holistically, not just at the code level.

Autonomous but guided

The key insight I've gained from working with these tools is that the future isn't about AI replacing developers. It's about AI becoming an increasingly capable collaborator that can take initiative while still respecting human guidance and expertise.

The most effective teams in 2025 may be those that learn to:

Set clear boundaries and guidelines for their AI agents

Establish strong architectural patterns that agents can work within

Create effective feedback loops between human and AI capabilities

Maintain human oversight while leveraging AI autonomy

The English-first development environment

As Andrej Karpathy noted:

"The hottest new programming language is English."

This is a fundamental shift in how we'll interact with development tools. The ability to think clearly and communicate precisely in natural language is becoming as important as traditional coding skills.

This shift toward agentic development will require us to evolve our skills:

Stronger system design and architectural thinking

Better requirement specification and communication

More focus on quality assurance and validation

Enhanced collaboration between human and AI capabilities

6. The return of software as craft?

While AI has made it easier than ever to build software quickly, we're at risk of losing something crucial: the art of creating truly polished, consumer-quality experiences.

Source: Garry Tan on X

The demo-quality trap

It's becoming a pattern: Teams use AI to rapidly build impressive demos. The happy path works beautifully. Investors and social networks are wowed. But when real users start clicking around? That's when things fall apart.

I've seen this firsthand:

Error messages that make no sense to normal users

Edge cases that crash the application

Confusing UI states that never get cleaned up

Accessibility completely overlooked

Performance issues on slower devices

These aren't just P2 bugs. They're the difference between software people tolerate and software people love.

The lost art of polish

Creating truly self-serve software, the kind where users never need to contact support, requires a different mindset:

Obsessing over error messages

Testing on slow connections

Handling every edge case gracefully

Making features discoverable

Testing with real, often non-technical users

This kind of attention to detail (perhaps) can't be AI-generated. It comes from empathy, experience, and deep care about craft.

The renaissance of personal software

I believe we're going to see a renaissance of personal software development. As the market gets flooded with AI-generated MVPs, the products that will stand out are those built by developers who:

Take pride in their craft

Care about the little details

Focus on the full user experience

Build for the edge cases

Create truly self-serve experiences

The irony? AI tools might actually enable this renaissance. By handling the routine coding tasks, they free up developers to focus on what matters most: creating software that truly serves and delights users.

The bottom line

AI isn't making our software dramatically better because software quality was (perhaps) never primarily limited by coding speed. The hard parts of software development — understanding requirements, designing maintainable systems, handling edge cases, ensuring security and performance — still require human judgment.

What AI does do is let us iterate and experiment faster, potentially leading to better solutions through more rapid exploration. But this will only happen if we maintain our engineering discipline and use AI as a tool, not as a replacement for good software practices. Remember: The goal isn't to write more code faster. It's to build better software. Used wisely, AI can help us do that. But it's still up to us to know what "better" means and how to get it.

Additional thoughts

Gergely again. Thank you, Addy, for this pragmatic summary on how to rethink our expectations on AI and software engineering. If you enjoyed this piece from Addy, check out his other articles and his latest book: Leading Effective Engineering Teams.

Here are my additional thoughts on AI and software engineering.

A good time to refresh what software engineering really is

Much of the discourse on AI tooling for software engineering focuses on code generation capabilities, and rightfully so. AI tools are impressive at generating working code from prompts, or suggesting inline code as you build software. But how much of the process of building software is coding itself? About 50 years ago, Fred Brooks estimated it at around 15-20% of all time spent. Here are Brooks’ thoughts from The Mythical Man-Month, written in 1975:

“For some years, I have been successfully using the following rule of thumb for scheduling a software task:

⅓ planning

⅙ coding

¼ component test and early system test

¼ system test, all components in hand.”

My take is that today, software engineers probably spend their time like this:

20% planning

40% coding (code + tests)

20% code review (others' code)

20% production readiness + rollout + small fixes during this + monitoring+alerting

At the same time, building standout software has a lot of other parts:

What: Figure out what to build. This can involve brainstorming, designing, user testing, working with product managers and business stakeholders, and so on. For startups, this phase can take very little time (“just build it and see if it works!”). For established companies, it can take up more time than building, though (“we need to make sure what we build doesn’t confuse our existing customers!”).

How: Draw up a plan on how to build the product/feature/service. Think through architecture implications, dependencies, how to test the product, and so on. Again, startups might be able to skip this stage, and the team can jump straight to planning. But for larger companies with more services and dependencies, leaving out planning will come back to bite the team. So most teams are doing some kind of planning using Design docs, RFCs, or ADRs.

Build: Implement the feature or product: write the code, and make sure it works.

Verify: Double check that it works as expected before shipping to production. This is especially important in cases where shipping is high-stakes: for example, shipping a regression to a banking app could have financial implications for customers, and the business! We went into details about QA in QA across the tech industry.

Ship it: Merge the change, and ship to customers. There are plenty of strategies to ship changes to production. We covered several of these in Shipping to production.

Monitoring and oncall: Detect when something is wrong with the product. If there’s an outage, resolve it as soon as possible, and then make sure a similar outage won’t happen again. We looked at these common approaches in Healthy oncall practices and in Incident review and postmortem best practices.

Maintain: Listen to customer complaints and feedback, and decide which bugs warrant fixing, and which are feature requests to prioritize. And figure out what feedback to disregard.

Migrate: If the product undergoes large changes, or if the tech stack sees major changes, like a new framework, migrations might be needed. We covered more in Migrations done well.

AI tools today can help a lot with the “Build” part. But here is a good question: Just how useful are they for the other 7 things that are also part of software engineering?

Needing no developers: the dream since the 1960s

Non-technical people creating working software without needing to rely on software developers has been the dream since the 1960s. Coding is about translating from what people want (the customers, business stakeholders, the product manager, and so on) to what the computer understands. LLMs offer us a higher level of abstraction where we can turn English into code. However, this new abstraction does not change the nature of how software is created, and what software is, which is this:

How software is created (and what software is – it’s more than just code!)

GenAI tools don’t change the process, but they do make some of the coding parts more efficient:

How GenAI tools change our work as software engineers

Throughout the history of technology, new innovations promised the ability for business folks to collapse or bypass the “tech” part, and get straight to working software from their high-level prompts. This was the aspiration of:

1960s: the high-level programming language COBOL. COBOL stands for “common, business-oriented language.” The stated goal of this language was to allow business people with no programming background to use it.

1990s: Visual Basic. A programming language meant to have a very low learning curve, plus a visual environment where forms can be created with drag-and-drop.

Late 2010s: The no-code movement. Through templates and visual editing, no-code solutions like Bubble offer a way to build software applications.

Unsurprisingly, several GenAI coding startups aspire for the same goal: to allow anyone to create software, by using the English language. In the past, we have seen success for simpler use cases. For example, these days, there is no coding knowledge needed to create a website: non-technical people can use visual editors and services like Wix.com, Webflow, Ghost or WordPress.

The higher-level the abstraction, the harder it is to specify how exactly the software should work. No-code solutions already ran into this exact limitation. As advisory CTO Alex Hudson writes in his article The no-code delusion:

“The development of these syntaxes has generally run into the problem of expression: once they are simple enough to pick up quickly, they are no longer expressive enough to use in many scenarios. And vice-versa: some languages have the ability to define a custom language within them, called domain-specific languages (DSLs).

Few of these languages have ever been truly successful amongst the development community at large, primarily because they again make things extremely complex.”

For more complex software, it’s hard to see software engineers not being needed to take part in planning, building, and maintaining it. And the more GenAI lowers the barrier for non-technical people to create software, the more software there will be to maintain.

AI agents: a major promise, but also a big “unknown” for 2025

Two years after the launch of LLMs, many of us have gotten a pretty good handle on how to use them to augment our coding and software engineering work. They are great for prototyping, switching to less-familiar languages, and tasks where you can verify their correctness, and call out hallucinations or incorrect output.

AI agents, on the other hand, are in their infancy. Most of us have not used them extensively. There is only one generally available agent, Devin, at $500/month, and early responses are mixed.

A lot of venture funding will be pouring into this area. We’ll see more AI coding agent tools launch, and the price point will also surely drop as a result. GitHub Copilot is likely to make something like Copilot Workspace (an agentic approach) generally available in 2025. And we’ll probably see products from startups such as /dev/agents, founded by Stripe’s former CTO, David Singleton.

AI agents trade off latency and cost (much longer time spent computing results and running prompts several times, which these startups describe as “thinking”) for accuracy (better results, based on the prompts). There are good open questions about how much accuracy will improve with this latency-plus-cost tradeoff, and which engineering use cases will see a significant productivity boost as a result.

Demand for experienced software engineers could increase

Experienced software engineers could be in more demand in the future than they are today. The common theme we’re seeing with AI tooling is how senior-and-above engineers can use these tools more efficiently, as they can “aim” better with them. When you know what “great output” looks like, you can prompt better, stop code generation when it’s getting things wrong, and you can know when to stop prompting and go straight to the source code to fix the code itself.

We will see a lot more code produced with the help of these AI tools, and a lot more people and businesses starting to build their own solutions. As these solutions reach a certain level of complexity, it’s a safe bet that many of them will need to bring in professionals to tame that complexity, which requires experienced engineers. Existing tech companies will almost certainly produce more code with AI tools, and they will rely on experienced engineers to deal with the increase in complexity that necessarily follows.

As a software engineer, mastering AI-assisted development will make you more productive, and also more valuable. It’s an exciting time to be working in this field: we’re living through a time of accelerated tooling innovation. It does take time to figure out how to “tame” the current tools in a way that makes you the most productive: so experiment with them!

I hope you’ve found the practical approaches from Addy helpful. For additional pointers, see the issue AI Tooling for Software Engineers in 2024: Reality Check.

Saturday, 04. January 2025

Michael Ruminer

Using LlamaIndex Part 1 — OpenAI

Using LlamaIndex Part 1 — OpenAI I have started to experiment with LlamaIndex for use in Retrieval Augmented Generation (RAG) document parsing and indexing. My results were mixed on the simple page provided. This is part 1, where I make a short post on LlamaIndex with OpenAI as the LLM component. I expect part 2 to be LlamaIndex with Ollama and Llama3–8b as the LLM components. This is a very sho
Using LlamaIndex Part 1 — OpenAI

I have started to experiment with LlamaIndex for use in Retrieval Augmented Generation (RAG) document parsing and indexing. My results were mixed on the simple page provided. This is part 1, where I make a short post on LlamaIndex with OpenAI as the LLM component. I expect part 2 to be LlamaIndex with Ollama and Llama3–8b as the LLM components.

This is a very short chunk of code. I also used the LlamaIndex Parse browser-based tool to see if I received different outputs. As one would expect, I did not. You can access the browser-based tool by opening a LlamaIndex account and choosing the “Parse” tool in your dashboard. You’ll need an account if you plan to use the code I provide and you will also need to generate an API key from your LlamaIndex dashboard. One of the great things about LlamaIndex is that for a paid tool it is generous in its free usage; 1000 credits PER DAY. In “accurate” mode, it is 1 credit per page; in “premium” mode, it is 15 credits per page. For my simple one page example the output between the two did not differ.

First the small snippet of code.

# pip install llama-index-embeddings-openai llama-index-llms-openai
# pip install llama-index-core llama-parse llama-index-readers-file

from llama_parse import LlamaParse
from llama_index.core import VectorStoreIndex
from dotenv import load_dotenv


load_dotenv()

parser = LlamaParse(result_type="markdown", verbose=True)

file_name = "ssi-page-5.pdf"
extra_info = {"file_name": file_name}

with open(f"../../sample_docs/{file_name}", "rb") as file_to_parse:
# must provide extra_info with file_name key when passing file object
documents = parser.load_data(file_to_parse, extra_info=extra_info)
# to manually check the MD output uncomment the below
# print(documents[0].text)

# index the parsed documents
index = VectorStoreIndex.from_documents(documents)

# generate a query engine for the index
query_engine = index.as_query_engine()

# provide the query and output the results
query = "what are the principles of SSI?"
response = query_engine.query(query)
print(response)

You can find this code and a few sample documents, including the document used in this code in my LlamaIndex_Test Github repo with the code specifically under the Test0 folder.

Note that I don’t set an LLM or an embedding model. LlamaIndex uses OpenAI as the default LLM and OpenAI’s text-embedding-ada-002 as the default embedding model. You will need an OpenAI API key to go along with the LlamaIndex key. My code loads them from the .env file into environment variables, and if they are named appropriately those variables will be found by default. Below is a .env example.

OPENAI_API_KEY=YOUR_API_KEY
LLAMA_CLOUD_API_KEY=YOUR_API_KEY
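
If you would rather set these explicitly instead of relying on the defaults, LlamaIndex exposes a global Settings object. A minimal sketch, assuming recent llama-index packages (the model names below are just illustrative choices, not what my example uses):

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Pin the LLM and embedding model explicitly rather than relying on the defaults.
# The model names are example values; swap in whatever you actually want to use.
Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")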

In the code above I am using a single-page PDF, “ssi-page-5.pdf”. It is page 5 of the larger document, “Self-Sovereign Identity A Systematic Review Mapping and Taxonomy.pdf”. If you plan to send LlamaParse a larger document but use the API properties to tell it to parse only a subset of pages from the document, keep in mind that LlamaParse starts at page 0. The first time I tried this I had an off-by-one issue because I assumed page 1 of the document was, you know, page 1. It was page 0. This is understandable from a programming standpoint but caught me off guard anyway.
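
If memory serves, LlamaParse lets you restrict parsing to a subset of pages with a target_pages argument that takes a comma-separated string of zero-based page indices; treat the parameter name as an assumption on my part and double-check the LlamaParse docs before relying on it. A sketch:

from llama_parse import LlamaParse

# "4" asks for the fifth page of the document, because indexing starts at page 0.
# target_pages is assumed from memory; verify it against the current LlamaParse documentation.
parser = LlamaParse(result_type="markdown", target_pages="4")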

In the example code, I opened a file directly but LlamaIndex provides a directory reader with filters, if you desire to use that instead. The results I got back on the LLM query were spot on as would be expected on a single page of context with a well-outlined section pertinent to my simple query.
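
For reference, a rough sketch of that directory-reader route, assuming you want LlamaParse to handle the PDFs it finds (the paths and filters here are placeholders, not from my repo):

from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(result_type="markdown")

# Route every PDF in the folder through LlamaParse; other file types are skipped.
reader = SimpleDirectoryReader(
    input_dir="../../sample_docs",
    required_exts=[".pdf"],
    file_extractor={".pdf": parser},
)
documents = reader.load_data()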

You don’t really need the creation of the vector index, query engine, and query/response to test out LlamaIndex parsing. Just uncomment line 23 in the above code (line 19 in the repo code), comment out everything below it, and get the parsed output.

Premium Mode, Auto Mode, and Less-than-Expected Outcomes

In the code, I didn’t try out premium mode or auto mode. I intend to make a separate post about auto mode. I did try them in the LlamaIndex Parse tool. In both, I expected the image at the top of the page to get output as an image in the “Images” tab of the Parse output tool, but it didn’t.

The image at the top of the page is below as a screen capture.

This was disappointing. I’m not sure why this did not provide the expected outcome.

There you have it: a simple bit of code to parse using LlamaIndex. What makes it different from the other parsers I have tried (all open source) is that it outputs the results in markdown, if desired, which is better than the usual plain text I received from other tools. The markdown gives the LLM more context, even if in my simple case it was not of value. The other difference is that, in theory, it will do a better job parsing images, tables, etc., but as I explained, I did not get that result. :-( I’ll continue to experiment with it, especially on more complicated pages such as ones that contain a table, and in auto mode via code.

You can find part 2 as “AI RAG with LlamaIndex, Local Embedding, and Ollama Llama 3.1 8b”. The example in part 2 uses LlamaParse auto mode.

Friday, 03. January 2025

Mike Jones: self-issued

Proposed Second Candidate Recommendation for Securing Verifiable Credentials using JOSE and COSE

The W3C Verifiable Credentials Working Group published the Snapshot Second Candidate Recommendation of the Securing Verifiable Credentials using JOSE and COSE specification just before the holidays. This was one of five Candidate Recommendation Snapshots published by the working group at the same time, including for the Verifiable Credentials Data Model 2.0, which I’m also an […]

The W3C Verifiable Credentials Working Group published the Snapshot Second Candidate Recommendation of the Securing Verifiable Credentials using JOSE and COSE specification just before the holidays. This was one of five Candidate Recommendation Snapshots published by the working group at the same time, including for the Verifiable Credentials Data Model 2.0, which I’m also an editor of. A W3C Candidate Recommendation Snapshot is intended to become a W3C Candidate Recommendation after required review and approval steps.

As I wrote about the First Candidate Recommendation, VC-JOSE-COSE secures VC Data Model payloads with JOSE, SD-JWT, or COSE signatures. And while I’m admittedly not a fan of JSON-LD, to the extent that Verifiable Credentials using the JSON-LD-based VC Data Model are in use, I’m committed to there being a solid VC-JOSE-COSE specification so there is a simple, secure, standards-based way to sign these credentials.

One significant change since the First Candidate Recommendation was splitting the Controller Document text out into its own specification called Controlled Identifier Document 1.0. Publishing a Candidate Recommendation Snapshot for it is planned for next week. Part of why it became its own specification is so that it can be referenced by the planned update to the W3C DID specification.

Thanks to my co-editor Gabe Cohen and working group chair Brent Zundel for the significant work they both put in to help us reach this point!

Wednesday, 01. January 2025

Just a Theory

Should URI::mysql Switch to DBD::MariaDB?

Should Sqitch and URI::mysql use DBD::MariaDB instead of DBD::mysql? If so, what are the implications for Sqitch deployment and usage?

I seek the wisdom of the Perl Monks:

The Sqitch project got a request to switch from DBD::mysql to DBD::MariaDB. DBD::mysql 5’s requirement to build from the MySQL 8 client library provides the impetus for the request, but in poking around, I found a blogs.perl.org post highlighting some Unicode fixes in DBD::MariaDB, as well.

Now, Sqitch likely doesn’t have the Unicode issue (it always works with Perl Unicode strings), but it depends on URI::db to provide the DBI connection string. For MySQL URIs, the URI::mysql dbi_driver method returns mysql.

Should it be changed to return MariaDB, instead? Is there general community consensus that DBD::MariaDB provides better compatibility with both MySQL and MariaDB these days?

I’m also curious what the impact of this change would be for Sqitch. Presumably, if DBD::MariaDB can build against either the MariaDB or MySQL client library, it is the more flexible choice to continue supporting both databases going forward.

Feedback appreciated via PerlMonks or the Sqitch issue.

Update 2025-01-08

URI-db 0.23 uses DBD::MariaDB instead of DBD::mysql for both URI::mysql and URI::MariaDB.

Similarly, Sqitch v1.5.0 always uses DBD::MariaDB when connecting to MySQL or MariaDB, even when using older versions of URI::db. Thanks everyone for the feedback and suggestions!

More about… Perl Sqitch URI::db MySQL MariaDB

New JSONPath Feature: SelectLocated

New in the jsonpath Go package and Playground: “Located” results that pair selected values with normalized paths to their locations.

Happy New Year! 🎉🥳🍾🥂

The JSONPath RFC includes a section on defining normalized paths, which use a subset of JSONPath syntax to define paths to the location of a node in a JSON value. I hadn’t thought much about it, but noticed that the serde JSONPath Sandbox provides a “Located” switch that adds them to query results. For the sake of complementarity, I added the same feature to the Go JSONPath Playground.

🛝 See it in action with this example, where instead of the default output:

[ 8.95, 12.99, 8.99, 22.99, 399 ]

The located result is:

[ { "node": 8.95, "path": "$['store']['book'][0]['price']" }, { "node": 12.99, "path": "$['store']['book'][1]['price']" }, { "node": 8.99, "path": "$['store']['book'][2]['price']" }, { "node": 22.99, "path": "$['store']['book'][3]['price']" }, { "node": 399, "path": "$['store']['bicycle']['price']" } ]

v0.3.0 of the github.com/theory/jsonpath Go package enables this feature via its new SelectLocated method, which returns a LocatedNodeList that shows off a few of the benefits of pairing JSONPath query results with paths that uniquely identify their locations in a JSON value, including sorting and deduplication. It also takes advantage of Go v1.23 iterators, providing methods to range over all the results, just the node values, and just the paths. As a result, v0.3.0 now requires Go 1.23.

The serde_json_path Rust crate inspired the use of LocatedNodeList rather than a simple slice of LocatedNode structs, but I truly embraced it once I noticed the focus on “nodelists” in the RFC’s overview, which provides this definition:

A JSONPath expression is a string that, when applied to a JSON value (the query argument), selects zero or more nodes of the argument and outputs these nodes as a nodelist.

It regularly refers to nodelists thereafter, and it seemed useful to have an object to which more features can be added in the future. github.com/theory/jsonpath v0.3.0 therefore also changes the result value of Select from []any to the new NodeList struct, an alias for []any. For now it adds a single method, All, which again relies on Go v1.23 iterators to iterate over selected nodes.

While the data type has changed, usage otherwise has not. One can iterate directly over values just as before:

for _, val := range path.Select(jsonInput) {
    fmt.Printf("%v\n", val)
}

But All removes the need to alias-away the index value with _:

for val := range path.Select(jsonInput).All() {
    fmt.Printf("%v\n", val)
}

I don’t expect any further incompatible changes to the main jsonpath module, but adding these return values now allows new features to be added to the selected node lists in the future.

May you find it useful!

More about… Go JSONPath Playground RFC 9535

Wrench in the Gears

Wishing All A Resonant New Year – May We Have The Blessings Of New Learnings And Chances To Grow In 2025

We don’t always get to choose what happens during the course of our lives, but we DO get to choose how we incorporate the unexpected into our stories. Here’s to seeing the value of complementary energies that keep this “reality” going even during times when the mixture makes us queasy. May we learn to navigate [...]

We don’t always get to choose what happens during the course of our lives, but we DO get to choose how we incorporate the unexpected into our stories. Here’s to seeing the value of complementary energies that keep this “reality” going even during times when the mixture makes us queasy. May we learn to navigate complex landscapes layered with meaning by remembering to be still, open, without knee jerk judgement. Quantum superposition is full of potential, if only we can learn how to engage with it. When we get stuck, try to remember there are other angles, different scales, alternative time structures. I believe the divine cosmos of which we are an integral part is beneficent, responsive, and playful. In an age of apocalyptic anxiety, may we stay in touch with our inner children who can create many somethings from an apparent “nothing.” Tap the inside of the doughnut, torus fields pulsing with heart energy for all beings, keeping collective time for this beautiful Earth that holds our bodies and souls for now.

New Year’s Greetings from Hot Springs, Arkansas!

PS: Joseph has a different operating system than I do, but it seems we keep arriving in roughly the same place. Be sure to check out his posts on the algorithmic “circle of values” and ideas about quantum realism, information theory, and death.

PPS: I highly recommend the Adrian Tchaikovsky “Children of Time” trilogy. This is a link to a PDF I made this morning of relevant sections pertaining to crystalline communication with the perceived “divine,” Lamarckian epigenetics, complex system evolution, heresy, ant computers, and encrypted chemical engineering.

A half-hour update and end-of-2024 thoughts below.

 


IdM Laboratory

The Intention Economy, Revisited

Hello, this is Fujie. At the end of the year, Doc Searls published a post about the Intention Economy titled “The Real Intention Economy.” I think it is quite an important post and well worth reading. https://doc.searls.com/2024/12/30/the-real-intention-economy/ His book has also been translated into Japanese. https://amzn.to/3C2PX9n Coming back to Doc’s post: nearly 20 years after he first presented the idea of the Intention Economy, and now with generative AI in the mix, the Intention Economy itself is becoming a threat. Searching for “Intention Economy” apparently returns…
Hello, this is Fujie.
At the end of the year, Doc Searls published a post about the Intention Economy titled “The Real Intention Economy.” I think it is quite an important post and well worth reading.
https://doc.searls.com/2024/12/30/the-real-intention-economy/
His book has also been translated into Japanese. https://amzn.to/3C2PX9n

Coming back to Doc’s post: nearly 20 years after he first presented the idea of the Intention Economy, and now with generative AI in the mix, the Intention Economy itself is becoming a threat.
He notes that searching for “Intention Economy” now returns some fairly alarming results. In short, the keyword “Intention Economy” is starting to be misused.
He quotes passages like this:
The near future could see AI assistants that forecast and influence our decision-making at an early stage, and sell these developing “intentions” in real-time to companies that can meet the need – even before we have made up our minds.


He also quotes this passage:
The rapid proliferation of large language models (LLMs) invites the possibility of a new marketplace for behavioral and psychological data that signals intent.



The Intention Economy was originally conceived as an antithesis to the model of selling customers’ attention as a product: an economy centered on intention instead. Yet now a market is forming in which that very intention is itself sold as a product.

Human desire knows no bounds, but we should keep the origins of these ideas clearly in view and understand the intent behind them as we move forward with implementing them in society.

 


Tuesday, 31. December 2024

Just a Theory

SQL/JSON Path Playground Update

The Go SQL/JSON Playground has been updated with a fresh design and PostgreSQL 17 compatibility.

Based on the recently-released Go JSONPath and JSONTree playgrounds, I’ve updated the design of the SQL/JSON Playground. It now comes populated with sample JSON borrowed from RFC 9535, as well as a selection of queries that randomly populate the query field on each reload. I believe this makes the playground nicer to start using, not to mention more pleasing to the eye.

The playground has also been updated to use the recently-released sqljson/path v0.2 package, which replicates a few changes included in the PostgreSQL 17 release. Notably, the .string() function no longer uses a time zone or variable format for dates and times.

Curious to see it in action? Check it out!

More about… Postgres SQL/JSON JSON Path Go Playground

Werdmüller on Medium

Tintin and the fascists

The boy reporter has a fascist origin. But something new is possible. Continue reading on Medium »



Predictions for tech, 2025

It’s going to be a hard year. Continue reading on Medium »


Monday, 30. December 2024

IdM Laboratory

366/366 !!!

Hello, this is Fujie. The day has finally come. A chicken race with a certain cat started over last year’s New Year holidays, and once it got going there was no stopping it, so I kept writing out of sheer momentum; before I knew it, a whole year had passed. ↓ The cat’s post: “Fujie-san, you’re really at it! It’s the 366 challenge of writing an article every day! You’re definitely doing this!” — 👹秋田の猫🐱 (@ritou) January 4, 2024. Early on there were quite a few “let’s try implementing it!” style posts, but through the middle and latter part of the year I got far too busy, and I have to reflect on the fact that “let’s read through it” posts ended up making up the majority. So, let’s look back. Here are January’s posts. I was building quite a lot around this time; well, it all started because I had free time over the winter break. January also brought OpenID Summit Tokyo…

Hello, this is Fujie.

The day has finally come.



A chicken race with a certain cat started over last year’s New Year holidays, and once it got going there was no stopping it, so I kept writing out of sheer momentum. Before I knew it, a whole year had really passed.

↓ The cat’s post

Fujie-san, you’re really at it!
It’s the 366 challenge of writing an article every day!
You’re definitely doing this!

— 👹秋田の猫🐱 (@ritou) January 4, 2024


Early on there were quite a few “let’s try implementing it!” style posts, but through the middle and latter part of the year I got far too busy, and I have to reflect on the fact that “let’s read through it” posts ended up making up the majority.

So, let’s look back.

Here are January’s posts.


I was building quite a lot around this time. Well, it all started because I had free time during the winter break and began implementing things.

January also brought OpenID Summit Tokyo. In fact, coordination to host SIDI Hub in Japan, an effort that continued into the second half of 2024 and is still active today, also began around this time.


Next, February. Around this point I realized that this year has 366 days, which puts me at a one-day disadvantage compared with other years!!!


I was still implementing things. Having wrapped up the OpenID Provider work, I started dabbling in passkeys. It was also a period when I really felt that there is a lot you just don’t understand until you get your hands dirty.


March.


I was still doing implementation work, but I also began introducing various news items and posting about things I think about day to day. As a result, I was able to have discussions with people who read the posts, which was very educational.


April.


Around February I had been researching and thinking about the Digital Agency’s authentication app, and as a result I was interviewed by several media outlets; it was a period when I thought deeply about various kinds of real-world deployment. Personally, with the new fiscal year starting at the same time, all I remember is being quite busy...


May.


April through June had a lot of events, so I was writing up impressions, and it was also right when the supplement to NIST SP 800-63-3 on syncable credentials came out.

I remember there being a lot to read.


June.


EIC took place, so I attended. I need to go again next year...

Various other documents were also published, so read-through style posts started to increase.


July.

The halfway point.


Come to think of it, the DIF Japan kickoff happened around this time. I also published a paper on DID/VC. It was a hot period in more ways than one.


August.


There was a lot of discussion around passkeys during this period, so I translated some of it into Japanese and published it. Passkeys are still far from fully adopted, so I want to keep watching how things develop.

This was also when I started thinking fairly seriously about trust and about using digital credentials for identity proofing and verification, so there are posts in that vein as well. There are still plenty of sloppy implementations out in the world, so I hope everyone can start thinking about this seriously.


September.


There was the SIDI Hub Washington, D.C. meeting, and the Berlin and Cape Town reports were published, so it was a SIDI Hub-heavy month. It was also when the AuthZEN WG, having finally opened Pandora’s box, got seriously under way, so I started watching the Authorization API as well.


October.


We held the SIDI Hub Summit in Tokyo at the end of October, so I was very busy with the preparations. The end of the month into early November also had IIW and IETF.

Handling and preparing international events is a good experience no matter how many times you do it. Exhausting, though.


November.


Linkability remains one of the unsolved issues that never runs out of debate. It has been a topic at IIW for years, and it has been discussed repeatedly on the IETF mailing lists.


December. It finally comes to an end.


I was invited by the Taiwanese government to talk about wallets, and since the AAMVA guidelines I am reading right now were updated at the end of November, I worked through them as well.



And so the year flew by.


So, what about next year? Well, it has become a habit, so I intend to keep writing without overdoing it, but I also feel the need to rest now and then, so I probably won’t post every day. Quality suffers, too.


With that, happy New Year, everyone!






Patrick Breyer

Should I object to the electronic patient record, and how do I do that?

If you have statutory health insurance, your health insurance fund will set up an electronic patient record for you by 15 January 2025 unless you object. This also applies to private health insurers that …
If you have statutory health insurance, your health insurance fund will set up an electronic patient record for you by 15 January 2025 unless you object. This also applies to private health insurers that voluntarily offer an electronic patient record. You must not suffer any disadvantages as a result of objecting. The letters sent out by the health insurance funds do not provide adequate information. Consider the following when making your decision:

1. Content and data security: Unless you object, every doctor’s visit, every therapy, and every medication will in future be stored in a centrally retrievable electronic patient record, including doctors’ letters, discharge reports, and diagnosis codes. This also includes information about addiction disorders, psychotherapy and mental illness, abortions, sexually transmitted diseases, and reproductive disorders (e.g. impotence), all of which is stored and aggregated. Central storage of all treatments in an electronic patient record gives you a single, uniform way to access your patient data. On the other hand, several centralized European health systems have already been affected by large-scale data breaches in the past. Nothing has happened yet with the electronic patient record in Germany, but security researchers warn that criminals could access arbitrary patient records with little effort. If hacking, data theft, or data loss does occur, far more information can be lost than without an electronic patient record.

2. Direct access: With the electronic patient record, the physicians treating you, your pharmacists, your physiotherapists, your care home, and so on, both in Germany and in future in other EU countries, can view your electronic patient record directly. They can also store it in their own practice software. The advantages: this data access can significantly speed up and improve your treatment and medication, for example by helping to avoid medication errors. Records do not have to be requested first and are immediately available, even in an emergency. Duplicate examinations are avoided. A possible disadvantage is that treatments, medications, etc. that are not relevant to other medical providers can be retrieved and viewed by them (e.g. your dentist or pharmacist could learn about your psychotherapy or sexually transmitted disease).

Without an electronic patient record you will, as you know it and may have experienced it so far, be asked on an intake form about previous treatments, medications, intolerances, allergies, and so on. If necessary, further records such as MRI scans will be requested. There is also a paper-based, nationally standardized medication plan that you can have printed out. This can, however, also be stored in the electronic patient record.

With an electronic patient record, you can completely exclude individual (medical) providers from accessing your record, either verbally, via an app, or through your health insurance fund. This comes with certain challenges. You also have the right to object, vis-à-vis providers, to individual pieces of information being transferred into the electronic patient record. You can also have individual documents deleted from your electronic patient record afterwards, though this is again rather complicated. Keep in mind that deletions can have disadvantages for your treatment.

3. Disclosure to third parties: For certain purposes, your electronic patient record can also be passed on to policymakers and public authorities, researchers, and commercial companies (e.g. technology companies and the pharmaceutical industry). The opportunities: anonymized analyses of health data enable significant progress in containing pandemics and endemic diseases, e.g. through early detection, as well as in research, e.g. in the development of safer medicines. The risks: data is only partly anonymized before being passed on. In other cases your name and insurance number are removed, but you remain uniquely identifiable via other identifiers (so-called pseudonymization). Third parties may thus manage to link your medical history back to you.

Note: even if you use the electronic patient record, you can separately object to its disclosure to third parties, at the earliest from 15 January 2025 (via an app from your health insurance fund or by letter to its ombudsman’s office).

Bottom line: it is ultimately your decision whether or not you want an electronic patient record.

If you want an electronic patient record to be kept for you, you do not need to do anything; it will be set up automatically from 15 January 2025.

If you decide against an electronic patient record, the Widerspruchs-Generator (objection generator) will help you create an objection letter. You can also object via your health insurance fund’s app or, with some funds, via their website.

You can object from the moment your health insurance fund has informed you of your right to object. An objection is also possible at any time after the six-week deadline mentioned in the letter; in that case, all data already stored will be deleted from the electronic patient record. If you change your mind later, you can withdraw your objection at any time and receive an electronic patient record going forward.

Further information from the GKV-Spitzenverband (coordinated with the Federal Commissioner for Data Protection)

How the Pirates prevented the electronic patient record from being made compulsory

Sunday, 29. December 2024

IdM Laboratory

Reading AAMVA’s Mobile Drivers License Implementation Guidelines, Part 8

Hello, this is Fujie. Continuing on, I am reading through AAMVA’s Mobile Drivers License Implementation Guidelines 1.4. The privacy chapter continues. 4.5. DELETING MDL INFORMATION FROM A DEVICE: An mDL holder must have the capability to delete the mDL holder’s mDL from the mDL holder’s device. Such deletion: Must delete all mDL information, log information, and any metadata (e.g. settings) that could impart information…

Hello, this is Fujie.

Continuing on, I am reading through AAMVA’s Mobile Drivers License Implementation Guidelines 1.4.


The privacy chapter still continues.

4.5. DELETING MDL INFORMATION FROM A DEVICE

An mDL holder must have the capability to delete the mDL holder’s mDL from the mDL holder’s device. Such deletion:

Must delete all mDL information, log information, and any metadata (e.g. settings) that could impart information about the deleted mDL or its use.
Must not require approval by the Issuing Authority.
Must be an option available to an mDL holder on the mDL device.
Must be possible when the mDL device is offline.
Should be available to an mDL holder via a request to the Issuing Authority (see below).

This is about deleting mDL information from the device. Essentially, the holder must be able to delete it on their own (without needing the issuer’s approval or a connection at that moment). It is a tricky area: elsewhere in this chapter the Issuing Authority is required to take responsibility for confirming that the mDL-related information it issued is handled appropriately, yet the holder’s rights have to be protected at the same time. At a minimum, the idea is to ensure that wallet developers cannot handle users’ data maliciously.

Should an mDL device (i.e. a device containing an mDL) be lost or get stolen, it could be beneficial for the mDL holder to have the mDL remotely deleted (or temporarily suspended) by the Issuing Authority. Besides the obvious advantage to the mDL holder, other considerations apply too:

The mDL holder’s request must be authenticated. It must not be possible for someone other than the mDL holder or the Issuing Authority to delete (or suspend) an mDL.
A “push” capability (from the Issuing Authority to the mDL device) is needed for immediate deletion (or suspension) (see section 6).
Successful deletion (or suspension) depends on network connectivity to the mDL device.
The mDL will automatically become unusable (although potentially not inaccessible) when the MSO expires (see section 6).

Since this is all smartphone-based, considerations around loss and theft naturally need to be…