Last Update 10:43 AM September 19, 2021 (UTC)

Identity Blog Catcher

Brought to you by Identity Woman and Infominer.
Support this collaboration on Patreon!!!

Sunday, 19. September 2021

Simon Willison

Weeknotes number 100

This entry marks my 100th weeknotes, which I've managed to post once a week (plus or minus a few days) consistently since 13th September 2019.

I started writing weeknotes to add some accountability to the work I was doing during my JSK fellowship year at Stanford. The fellowship ended over a year ago but I've stuck to the habit - I've been finding it really helpful as a structured approach to thinking about my work every week, and it occasionally helps motivate me to get things done enough that I have something I can write about!

Datasette Desktop 0.2.0

My big achievement this week was Datasette Desktop 0.2.0 (and the 0.2.1 patch release that followed). I published annotated release notes for that a few days ago. I'm really pleased with the release - I think Datasette as a desktop application is going to significantly increase the impact of the project.

I also sent out an issue of the Datasette Newsletter promoting the new desktop application.

Datasette Desktop for Windows

I did a quick research spike to investigate the feasibility of publishing a Windows version of Datasette Desktop. To my surprise, I managed to get a working prototype going with just a small amount of work:

So that was one heck of a lot easier than I expected... pic.twitter.com/BDa4gvkgnd

- Simon Willison (@simonw) September 15, 2021

Electron claims to solve cross-platform development and it seems to uphold that claim pretty well!

I'm still quite a bit of work away from having a release: I need to learn how to build and sign Windows installers. But this is a very promising first step.

json-flatten

I've started thinking about how I can enable Datasette Desktop users to configure plugins without having to hand-edit plugin configuration JSON (the current mechanism).

This made me take another look at a small library I released a couple of years ago, json-flatten, which turns a nested JSON object into a set of flat key/value pairs suitable for editing using an HTML form and then unflattens that data later on.

>>> import json_flatten
>>> json_flatten.flatten({"foo": {"bar": [1, True, None]}})
{'foo.bar.[0]$int': '1', 'foo.bar.[1]$bool': 'True', 'foo.bar.[2]$none': 'None'}
>>> json_flatten.unflatten(_)
{'foo': {'bar': [1, True, None]}}
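
To make the plugin-configuration-by-form idea concrete, here is a small sketch of the round trip the library enables: flatten a config object, change one value the way an HTML form submission would, then unflatten it (the key names are invented for illustration):

import json_flatten

# Flatten a nested plugin configuration into form-friendly key/value pairs
flat = json_flatten.flatten({"settings": {"latitude_column": "lat"}})
# flat is now {"settings.latitude_column": "lat"}

# Simulate a user editing that value in an HTML form field
flat["settings.latitude_column"] = "latitude"

# Unflatten back into nested JSON, ready to be saved as plugin configuration
print(json_flatten.unflatten(flat))
# {'settings': {'latitude_column': 'latitude'}}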

It turns out a few people have been using the library, and had filed issues - I released version 0.2 with a couple of fixes.

TIL this week

Cumulative total over time in SQL
Configuring auto-update for an Electron app

Releases this week

datasette-statistics: 0.1.1 - (2 releases total) - 2021-09-16
SQL statistics functions for Datasette

json-flatten: 0.2 - 2021-09-14
Python functions for flattening a JSON object to a single dictionary of pairs, and unflattening that dictionary back to a JSON object

datasette-app: 0.2.1 - (3 releases total) - 2021-09-13
The Datasette macOS application

datasette-app-support: 0.11.5 - (18 releases total) - 2021-09-13
Part of https://github.com/simonw/datasette-app

datasette-write: 0.2 - (3 releases total) - 2021-09-11
Datasette plugin providing a UI for executing SQL writes against the database

datasette-schema-versions: 0.2 - (2 releases total) - 2021-09-11
Datasette plugin that shows the schema version of every attached database

datasette-import-table: 0.3 - (6 releases total) - 2021-09-08
Datasette plugin for importing tables from other Datasette instances

Saturday, 18. September 2021

John Philpin : Lifestream

If I have to click on more picture of a ‘traffic light’ - I

If I have to click on one more picture of a ‘traffic light’ - I am going to throw this machine through the window!


The Modern Complexity of the Simple Blog. Somebody needs

The Modern Complexity of the Simple Blog.

Somebody needs to let them know about Micro Blog, Dave Winer and Blot.

Or maybe Macdrifter needs to either:

1) do better research, or

2) clarify the problem they are trying to resolve.


Ben Werdmüller

Constantly hungry for constructive criticism, and particularly ...

Constantly hungry for constructive criticism, and particularly hungry for guidance during this moment of my life. What could I be doing better? But asking for this is an imposition and puts people in a really uncomfortable position. Doing my best.


A novel is born

I’m obsessed with this kind of thing: an author self-publishing across various media using the internet as a kind of canvas. So obsessed that I really want to try it myself.

[Link]


Here's why California has the lowest COVID rate in the nation

“If the small inconvenience of wearing a mask could protect my neighbor, I wear one with a smile,” he said. “Similarly, if the science, my own self-interest and the protection of my neighbors all are promoted by getting a vaccine, I’m happy to join my neighbors in line.” Thank you, California.

[Link]


John Philpin : Lifestream

That moment when you create a new vault in Obsidian by movin

That moment when you create a new vault in Obsidian by moving a folder structure out of the current Obsidian vault, jump over to your new vault, and all the extensions and css now need to be set up from scratch ….

GAAAH !!!

Or is it … read on >>>


Ben Werdmüller

I’m curious how many investors use the ...

I’m curious how many investors use the Peter Thiel Roth loophole to invest - and how many are scrambling now it looks like it might be going away?

Friday, 17. September 2021

John Philpin : Lifestream

Paging Kiwi Peeps … Muriel Newman ex ACT now runs a ‘thin

Paging Kiwi Peeps …

Muriel Newman ex ACT now runs a ‘thinktank’ - anyone know much about her?

Should I be paying attention or ignoring her?


Ben Werdmüller

You’ll know if something is really democratizing ...

You’ll know if something is really democratizing finance if it works as or more beneficially for people with low balances. What’s the threshold at which the gains outweigh the fees (including gas)?


John Philpin : Lifestream

Interesting to see newly released Drafts 28.0 includes abili

Interesting to see newly released Drafts 28.0 includes ability to publish directly into an Obsidian folder.

But there’s more :

Blot is also looking at the ability to be a web front end to an Obsidian vault. Right now it works - links break, but we are moving forward.


Read On The Internet : “Real chaos makes no noise.”

Read On The Internet :

“Real chaos makes no noise.”


Bill Wendel's Real Estate Cafe

Real estate is still Broken — Time for mass movement to fix it?

As a long-time real estate consumer advocate, I find it obvious that the pandemic has exposed Peak Real Estate Dysfunction, and calls for change are coming from inside and…

The post Real estate is still Broken — Time for mass movement to fix it? first appeared on Real Estate Cafe.


Ben Werdmüller

There are Gen Z venture capitalists, and ...

There are Gen Z venture capitalists, and I am as old as dust.


Apple and Google Remove ‘Navalny’ Voting App in Russia

"Google removed the app Friday morning after the Russian authorities issued a direct threat of criminal prosecution against the company’s staff in the country, naming specific individuals, according to a person familiar with the company’s decision. The move comes one day after a Russian lawmaker raised the prospect of retribution against employees of the two technology compani

"Google removed the app Friday morning after the Russian authorities issued a direct threat of criminal prosecution against the company’s staff in the country, naming specific individuals, according to a person familiar with the company’s decision. The move comes one day after a Russian lawmaker raised the prospect of retribution against employees of the two technology companies, saying they would be “punished.”" This wouldn't be an issue if the app store wasn't hopelessly centralized.

[Link]


Fairness Friday: People’s Programs

I’m posting Fairness Fridays: a new community social justice organization each week. I donate to each featured organization. If you feel so inclined, please join me.

This week I’m donating to People’s Programs. Based in Oakland, People's Programs is a grassroots community organization that serves the people of Oakland and is dedicated to “the unification and liberation of Afrikans across the diaspora”.

Its programs include People’s Breakfast, a free breakfast program for Oakland’s houseless community, a health clinic, bail and legal support, a grocery program, and more. Modern inequality and generational injustices mean that organizations like People’s Programs are crucial lifelines for many people.

I donated. If you have the means, I encourage you to join me here. I also donated a tent from their tent drive wishlist.


Troll farms reached 140 million Americans a month on Facebook before 2020 election

"The report reveals the alarming state of affairs in which Facebook leadership left the platform for years, despite repeated public promises to aggressively tackle foreign-based election interference. MIT Technology Review is making the full report available, with employee names redacted, because it is in the public interest." [Link]

"The report reveals the alarming state of affairs in which Facebook leadership left the platform for years, despite repeated public promises to aggressively tackle foreign-based election interference. MIT Technology Review is making the full report available, with employee names redacted, because it is in the public interest."

[Link]


Restore point

Not too long after I wrote my blog post about cars, my car was broken into. Unfortunately, I'd made the unwise decision to leave my backpack in the boot, with all of my devices save my phone. They were swiped unceremoniously.

I feel pretty stupid about it: never leave your valuables in your car in a public place. Particularly not valuables you use for work.

But beyond that, I have a few observations about the cloud. Because less than 24 hours later, I'm completely back up and running again on new devices that have all the data, configurations, and feel of my old ones.

First of all, here's what Find My says about the ones that were stolen:

The headphones and the iPad pinged first, and then my laptop pinged about a minute later. You can see the thief progress north. Find My is pretty good at pinging through any available connection - that's why AirTags work - but the trail runs cold from there. Out of an abundance of caution, I marked the iPad and laptop as locked and left a message in case anyone tries to turn them on. (Unfortunately you can't lock the AirPods.)

This morning I set up a new laptop, and within an hour I had all my apps and files back. It's the same model as the old one, so it's in effect identical, except without all the cool stickers. I'm hopeful that my property insurance will help me pay for the replacement.

I've been backing up on iCloud for a while, and although I have some real worries about some of the direction that Apple's going in (the shelved plan to scan devices is, despite the obviously good intentions, deeply problematic), I'm relatively comfortable with the safety - and certainly the convenience.

For a moment I worried that I'd lost the video of my mother's memorial, which would have deepened this event from an inconvenience into a tragedy. But no, iCloud had managed to back up the video, and I was able to check it this morning.

For all their power, the value of our computers is in the information we store: and by information, I really mean stories, memories, creative work, and the things we make. When I upgrade my laptop or my phone, I get the ability to take photos in a higher fidelity, or create new kinds of things. But that underlying human footprint - the trail of how I got to here, and most importantly, the people I knew and loved - transcends. I'm grateful that I don't need to worry about losing it. It's all just magically there, waiting for me.

Clearing the broken glass out of my car, on the other hand, was a real pain in the ass.


John Philpin : Lifestream

To put the ‘still counting’ but ‘already a landslide’ win fo

To put the ‘still counting’ but ‘already a landslide’ win for Newsom in the CA recall in perspective … he won Orange County!

Thursday, 16. September 2021

John Philpin : Lifestream

Apologies for swamping the Micro Blog timeline - experimenti

Apologies for swamping the Micro Blog timeline - experimenting with Drafts and Obsidian working together and forgot that my blot site cross posted to micro.blog.

Cleaning the offenders out now - I HOPE it won’t happen again.


Bill Wendel's Real Estate Cafe

Real estate is rigged – Avoid 10 hidden costs of house hunting by being proactive

When asked to describe Real Estate Cafe’s business model over the past three decades, we’ve said (as we did in a recent podcast) that we…

The post Real estate is rigged - Avoid 10 hidden costs of house hunting by being proactive first appeared on Real Estate Cafe.


Ludo Sketches

WE DID IT !

Exactly 11 years ago, I was out of Sun/Oracle and started to work for an emerging startup, with a bunch of former colleagues from Sun (although my official starting date in the French subsidiary is November 1st, which is also my birthdate).

A couple of pictures from my 1st company meeting, in Portugal, end of September 2010.

Fast forward 11 years…

Today is a huge milestone for ForgeRock. We are becoming a public company, with our stock publicly traded under the “FORG” symbol on the New York Stock Exchange.

I cannot thank enough the founders of ForgeRock for giving me this gigantic opportunity to create the First ForgeRock Engineering Center just outside Grenoble, France, and to drive the destiny of very successful products, especially ForgeRock Directory Services.


John Philpin : Lifestream

A man who called himself a ‘concerned citizen’ and defended

A man who called himself a ‘concerned citizen’ and defended Theranos founder Elizabeth Holmes to reporters at her fraud trial turned out to be her boyfriend’s dad

Of course he did.

“My goal isn’t to talk about “fixing mobile”. Mobile will,

“My goal isn’t to talk about “fixing mobile”. Mobile will, eventually, get there. Too many people think “Mobile is the Future” but we are so far past that. Mobile is the present. We need to actually be thinking about the future that is coming and what we are going to need.”


Satire Site Gets Ridiculous Threat Letter From Baseball Team

Satire Site Gets Ridiculous Threat Letter From Baseball Team; cc’s Barbra Streisand In Its Response

Somehow this has to stop. The lawsuits - not the responses.

Nader Helmy

Adding DID ION to MATTR VII

Since the beginning of our journey here at MATTR, decentralization and digital identity have been central to our approach to building products. As part of this, we’ve supported Decentralized Identifiers (or DIDs) since the earliest launch of our platform. We’ve also considered how we might give you more options to expand the utility of these identities over time.

An important milestone

The W3C working group responsible for Decentralized Identifiers recently published the DID v1.0 specification under “Proposed Recommendation” status. This is a significant milestone as DIDs approach global standardization with the pending approval of the W3C Advisory Committee.

DIDs are maturing, but so are the environment and context in which they were originally designed. With a complex ecosystem consisting of dozens of different methodologies and new ones emerging on a regular basis, it’s important to balance the potential of this decentralized approach against a realistic assessment of the utility and value of each DID method. For example, the DID Method Rubric provides a good frame of reference for comparing different approaches.

Different types of DIDs can be registered and anchored using unique rules specific to the set of infrastructure where they’re stored. Since DIDs provide provenance for keys which are controlled by DID owners, the rules and systems that govern each kind of DID method have a significant impact on the trust and maintenance model for these identifiers. This is the key thing to remember when choosing a DID method that makes sense for your needs.

Our supported DID methods

In MATTR VII, by supporting a variety of DID methods — deterministic or key-based DIDs, domain-based DIDs, and ledger-based DIDs — we are able to provide tools which can be customized to fit the needs of individual people and organizations.

Key-based DIDs — Largely static, easy to create, and locally controlled. This makes them a natural choice for applications where there’s a need to manage connections and interactions with users directly.

DIDs anchored to web domains — These have a different trust model, where control over the domain can bootstrap a connection to a DID. This makes a lot of sense for organizations with existing domain names that already transact and do business online, and can extend their brand and reputation to the domain of DIDs.

Ledger-based DIDs — These offer a distributed system of public key infrastructure which is not centrally managed or controlled by a single party. While ledgers differ in their governance and consensus models, they ultimately provide a backbone for anchoring digital addresses in a way which allows them to be discovered and used by other parties. This can be a useful feature where a persistent identifier is needed, such as in online communication and collaboration.

There is no single DID method or type of DID which (at the moment) should be universally applied to every situation. However, by using the strengths of each approach we can allow for a diverse ecosystem of digital identifiers enabling connections between complex networks of people, organizations and machines.

To date, we’ve provided support for three main DID methods in our platform: DID Key, DID Web, and DID Sovrin. These align with three of the central types of infrastructure outlined above.
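
For orientation, identifiers for these three methods look roughly like the following; the values are illustrative examples rather than real registered DIDs:

did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK
did:web:mattr.global
did:sov:WRfXPg8dantKVubE3HX8pw

The method name after the did: prefix tells a resolver which rules to apply, which is why each method carries its own trust and maintenance model.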

Introducing DID ION

We’re proud to announce that as of today we’ve added support for DID ION, a DID method which is anchored to IPFS and Bitcoin. We’ve supported the development of the Sidetree protocol that underpins DID ION for some time as it has matured in collaboration with working group members at the Decentralized Identity Foundation.

With contributions from organizations such as Microsoft, Transmute, and SecureKey, Sidetree and DID ION have emerged as a scalable and enterprise-ready solution for anchoring DIDs. The core idea behind the Sidetree protocol is to create decentralized identifiers that can run on any distributed ledger system. DID ION is an implementation of that protocol which backs onto the Bitcoin blockchain, one of the largest and most used public ledger networks in the world.

Sidetree possesses some unique advantages not readily present in other DID methods, such as low cost, high throughput, and built-in portability of the identifier. This provides a number of benefits to people and organizations, especially in supporting a large volume of different kinds of connections with the ability to manage and rotate keys as needed. We have added end-to-end capabilities for creating and resolving DIDs on the ION network across our platform and wallet products.

Although DID ION is just one implementation of the Sidetree protocol, we see promise in other DID methods using Sidetree and will consider adding support for these over time as and when it makes sense. We’ll also continue to develop Sidetree in collaboration with the global standards community to ensure that this protocol and the ION Network have sustainable futures for a long time to come.

At the same time, the community around DID Sovrin is developing a new kind of interoperability by designing a DID method that can work for vast networks of Indy ledgers, rather than focusing on the Sovrin-specific method that’s been used to date. As DID Sovrin gets phased out of adoption, we’re simultaneously deprecating standard support for DID Sovrin within MATTR VII. We’ll be phasing this out shortly with upcoming announcements for customers building on our existing platform.

If you’ve got any use cases that utilize DID Sovrin or want to discuss extensibility options, please reach out to us on any of our social channels or at info@mattr.global and we’ll be happy to work with you.

Looking ahead

We believe this a big step forward in providing a better set of choices when it comes to digital identity for our customers. From the start, we have designed our platform with flexibility and extensibility in mind, and will continue to support different DID methods as the market evolves.

We look forward to seeing how these new tools can be used to solve problems in the real world and will keep working to identify better ways to encourage responsible use of digital identity on the web.

Adding DID ION to MATTR VII was originally published in MATTR on Medium, where people are continuing the conversation by highlighting and responding to this story.

Wednesday, 15. September 2021

John Philpin : Lifestream

🎶🎵SoundCloud Says Portishead Song Earned 500% More Under New

🎶🎵SoundCloud Says Portishead Song Earned 500% More Under New Royalty Plan

Reminds me of a question I have about SetApp.

How are the developers rewarded?

They get $$s because
- I downloaded it?
- I used it? (this month?)
- how much I use it?


Bill Wendel's Real Estate Cafe

Only pay for what you need: Will DOJ vs NAR result in fee-for-service future?

For the second time in recent weeks, an industry thought-leader has pointed to Fee-for-Service / Menu of Service business models in his ongoing series: Lesson…

The post Only pay for what you need: Will DOJ vs NAR result in fee-for-service future? first appeared on Real Estate Cafe.


blog.deanland.com

Oh, It Drives Me Nuts

My Drupal setup is broken. Some server issue. I have no idea how to fix it. My guess is the alert part that renders when the blog is visited will go away once I clean out all the Spam messages. That worked once before.

But ... last time I tried to do that, it wouldn't let me do that. Yeah. Locked out of capability on my own blog due to some glitch.

Where can I find someone who REALLY KNOWS THEIR STUFF around a server and around Drupal to help me out of this? Technical assistance is in order. Know such a person?

read more



Identity Praxis, Inc.

Mobile Messaging, it’s more than 160 characters! It Is Time to Get Strategic

by Jay O’Sullivan and Michael J. Becker

Over the past several years, the world of messaging has morphed in front of our very eyes. There are now five messaging channels, including text (inc. SMS, MMS, RCS), email, social media, chatbots (i.e. for support and conversational commerce), proximity alerts, and more than a dozen over-the-top apps (WhatsApp, WeChat, Viber, Apple Business Chat, Line, iMessage, Facebook Messenger, Pinterest, YouTube, Instagram, LinkedIn, Twitter, and more). What does this mean for organizations? Some may think it means that their organization has a variety of channels to choose from to reach the people they serve (aka consumers, shoppers, patients, investors, etc.). On the surface, this is true. But what it really means is that organizations must start developing their multi-channel messaging muscles so that they can reach individuals not on the organization’s preferred time and medium of communication, but on the individual’s preferred time and medium of communication.

To effectively manage a messaging program, the most important thing to realize is that messaging channels are NOT all created equal. Every channel has a different audience profile, audience expectations, norms, message lengths, message formats, ways to send and receive messages, and methods for reporting across the engagement continuum (e.g., transaction through relationship). Messaging is an ecosystem all of its own that must be nurtured if you want to find success.

Need I say more? Of course, I do!

“The customer is everywhere and nowhere.” Todd Harrison, SVP of Digital at Skechers (Harrison, 2021)

You may find yourself with prospects and customers and few ways to reach them. You may find yourself with tired campaigns that simply do not perform the way you want them to. You may find yourself losing touch with your most valuable asset, your first-party database. You may ask yourself, how did I get here? You may ask yourself, which channel should I use to achieve the best results?

Todd is right, “the customer is everywhere, and nowhere.” The way to address this problem is to put your customer at the heart of your business. What does this mean? It means you must learn to collect and listen to their preferences. To interact with and engage them, not just barrage them with mono-directional messages that basically say, “I have an idea, why don’t you buy more stuff from us?” People want to be respected, heard, and served. You need to treat them as the hero of your story. To make this happen, you need to understand them. You need to focus on building out an end-to-end communication strategy, which includes building a preference-based opt-in database, often referred to as a customer data platform (CDP). Depending on your needs you’ll need dedicated messaging platforms (e.g. SMS, 10DLC, Email, etc.) or possibly a multichannel communications platform as a service (CPaaS) to manage real-time messaging across all channels. Moreover, you’ll need a content strategy. And soon, to meet the expectations of those you serve, you’ll need to build the capability to deliver predictive, personalized, contextually aware content and offers, with real-time feedback loops so that you can listen to customers and respond to them when they reply or initiate a conversation with you. In the near future, we will find that messaging has become the cornerstone of the vast majority of businesses’ engagement strategies, but we have a long way to go as only six to ten percent of companies are actively nurturing and running commercial messaging programs today (Ruppert, 2021).

Yes, you can succeed with the tried and true tactics, but for how long? The marketplace and consumer sentiments are changing, and you must change with them, or you’ll be left behind.

“By 2030, society will no longer tolerate a business model that relies on mass transactions of increasingly sensitive personal data: a different system will be in place.” (Data 2030, 2020)

We’ll cover more of the bigger picture in future articles; for now, let’s get back to messaging.

What it takes to run a successful messaging program

The key to a successful messaging program is to start simple and build from there.

You should not be frightened by the sea of mobile messaging opportunities. Embrace them, and take them one step at a time. Take it from us, with the simplest of messaging programs you have the potential to see material success. For example, we are aware of a retail company that, in 12 months after launching a text-based SMS program sprinkled with the occasional MMS message, acquired 700,000 subscribers and is now generating over $1.4M in monthly sales.

Photo by Daria Nepriakhina on Unsplash

Where do you start?

SMS, or text messaging, is the most straightforward, omnipresent, and ubiquitous messaging channel that you can use to engage people anywhere in the world. And, best of all, it is proven. Open rates in SMS are often 95% and higher.

Text messaging is not just about driving sales; although sales are the end-game, it is about being of service to your audience throughout every stage of the relationship they have with you. At the most basic level, text messaging provides more engagement not just as a utility but as a consumer relationship tool – securing interactions, driving pre- and post-sales engagements, gathering feedback through surveys, and fostering loyalty and support. You can, and should, use it across every stage of the purchase funnel: discovery, awareness, consideration, conversion, onboarding and adoption, loyalty, support, and offboarding.
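
As a purely illustrative sketch (the article does not prescribe a vendor or API), a transactional text sent through a typical CPaaS Python SDK such as Twilio's looks roughly like this; the credentials and phone numbers are placeholders:

from twilio.rest import Client

# Placeholder account SID and auth token - substitute your own credentials
client = Client("ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "your_auth_token")

message = client.messages.create(
    to="+15555550123",     # the subscriber's opted-in mobile number
    from_="+15555550999",  # your provisioned long code, 10DLC number or short code
    body="Your order has shipped! Track it at https://example.com/track (Reply STOP to opt out)",
)
print(message.sid)  # provider-assigned message ID, useful for delivery reporting

The create-a-message shape is broadly similar across CPaaS providers, which is why the strategy questions below matter more than the specific SDK.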

NOTE: Keep an eye out for 10DLC, a new messaging standard that kicked off in June 2021. Our next article will be on this.

But What if?

Yes, text messaging, aka SMS, is ubiquitous, but what should you do when you need to grow beyond what texting has to offer? Remember, “the medium is the message” (Marshall McLuhan, 1964). Text messaging is not the right channel for every engagement.

What if you needed to reach and engage people in China, Brazil, Germany, or Australia? What if your product was best suited to be explained via a picture or a video? What are the best channels to reach these consumers? Text is not always the answer!

Photo by Nathan Dumlao on Unsplash

Facebook Messenger, Instagram, WhatsApp, and other messaging channels are making a considerable play at being complementary messaging options for consumers and, in some cases, the primary channel. Geographic location is the main driver for these channel decisions, as well as the use case. But the numbers don’t lie in regards to the global user base of over 2 billion consumers.

OTT Messaging Leaders

Where Should You Start?

Sounds like a lot? Sounds complicated? It’s really not if you think about these few items and take them one at a time in your messaging and engagement roadmap.

Here are a few things for you to consider that will help mold your direction. First, start with a few pillars: Message type (use-case, transactional, marketing, support…), Geolocation(s), Staffing, and Existing Partners. Consider:

Are you currently offering text transactional or marketing messaging?
Which markets and countries are you looking to reach?
How much is your social media advertising budget?
Are you considering messaging for support (don’t just think SMS, think OTT)?
Do you have internal staff to manage your messaging programs?
Are you currently using a Marketing Automation platform?
Are you working with external guides and mentors (Remember: coaches and mentors are required if you want to be an expert)?

Enter Personal Data & Identity

It is critical for you to remember that all effective commerce starts with a meaningful connection and relevant and consistent communication. The path to relevance is through data, particularly personal data, in fact, a new asset class that you need to wield with precision.

The Bottomline

For the majority, SMS (text) will be your first entry into the messaging space. But, depending on how you answered the questions above, the odds are you will be a perfect candidate to implement one of the other messaging channels too. Your next step is to evaluate your current programs and roadmap of “Needs” vs “Wants” and map out the use cases, i.e., the experience you want to offer your customers, prior to finding a good technology and solutions partner.

Your “Needs” may have to do with driving revenue, loyalty sign-ups, gathering preference data, support efficiencies, or personalization, and could be as simple as mapping out your KPIs and creating efficiencies in your current messaging strategy. Day, time, type of message, MMS vs. SMS, frequency, and more have a say in the results of your program. Remember, planning before tactics!

Messaging has proven to be the most effective tool for people-centric engagement programs, especially when used to deliver a thoughtful and meaningful experience. But in the age of the connected individual, the bottom line is that it is a necessity, not a nice-to-have.

There is a lot to navigate, but this is nothing like any other part of your business. You can do it. Create a roadmap. Find your partners. Take one step at a time!

REFERENCES

Data 2030: What does the future of data look like? | WPP. (2020). WPP. https://www.wpp.com/wpp-iq/2020/11/data-2030—what-does-the-future-of-data-look-like

Harrison, T. (2021). Todd Harrison, SVP Digital, Skechers. In LinkedIn. https://www.linkedin.com/in/livetomountainbike/

Most popular global mobile messenger apps as of July 2021, based on number of monthly active users. (2021). Statista. https://www.statista.com/statistics/258749/most-popular-global-mobile-messenger-apps/

Ruppert, P. (2021, September 15). PD&I Market Assessment Interview with Paul Ruppert (M. Becker, Interviewer) [Zoom].

The post Mobile Messaging, it’s more than 160 characters! It Is Time to Get Strategic appeared first on Identity Praxis, Inc..


John Philpin : Lifestream

Yeah yeah yeah …. The Full Story

Yeah yeah yeah ….

The Full Story


📚 literal.club

📚 literal.club

And backacha

And backacha

Tuesday, 14. September 2021

John Philpin : Lifestream

The End of American Exceptionalism … it’s a couple of yea

The End of American Exceptionalism

… it’s a couple of years old - and has too many tRump references … but he’s not wrong … if it ever was a reality.


“To see ourselves as others see us is a most salutary gift

“To see ourselves as others see us is a most salutary gift. Hardly less important is the capacity to see others as they see themselves.”

Aldous Huxley

via the wonderful Maria Popova


As you watch the politics of Texas and Florida ( specificall

As you watch the politics of Texas and Florida (specifically Abbott and DeSantis) … you wonder why all those hip technocrats moving there to avoid Californian taxes aren’t speaking up?


Mike Jones: self-issued

OpenID Connect Presentation at 2021 European Identity and Cloud (EIC) Conference

I gave the following presentation on the OpenID Connect Working Group during the September 13, 2021 OpenID Workshop at the 2021 European Identity and Cloud (EIC) conference. As I noted during the talk, this is an exciting time for OpenID Connect; there’s more happening now than at any time since the original OpenID Connect specs were created!

OpenID Connect Working Group (PowerPoint) (PDF)

@_Nat Zone

We announced GAIN – Global Assured Identity Network

At 7:30 pm local time in Munich on the 13th, we announced GAIN – Global Assured Identity Network at the European Identity and Cloud Conference 2021. It is an overlay network in which every participant has had their identity verified.

The slides I used are available below.

The white paper itself can be downloaded from https://gainforum.org/.

For inquiries about participating, please contact DigitalTrust _at_ iif.com (the secretariat inside the International Institute of Finance; replace _at_ with @). For inquiries about joining the POC, please contact donna _at_ oidf.org.

What is GAIN

I will leave the details to the English-language blog post below, beneath which I have also included a DeepL machine translation into Japanese for reference. (I have not verified it, so it may well contain mistakes and omissions, but it should be good enough for a rough overview. The original is in English, so please refer to that for the details.)

Announcing GAIN: Global Assured Identity Network

At the beginning of the internet, there was trust. Everyone was known to the organizations they participated in, and everyone knew they would be held accountable.

But when the internet opened up for commercial use in the 90s, that trust was lost. The majority of participants became anonymous and came to assume they could not be held responsible. As a result, many criminals and bad actors became active on the network. The wild, wild west era of the internet had arrived.

Since then, a great deal of effort has gone into improving the situation and restoring trust, without much effect. Financial crime grows out of fraud whose human impact is immeasurable, and it costs the world economy as much as 5% of annual GDP. Enormous sums are spent on anti-money-laundering and counter-terrorist-financing measures, so far to little avail. For every $1,000 of "illicit funds" in the financial system, $100 is spent on compliance, yet only $1 is intercepted. Just 0.1%.

The other side of this coin is the problem of financial inclusion.

Many people are financially excluded because the identity-verification costs involved in opening an account make it commercially unviable to take them on.

Being able to act anonymously has, in principle, great privacy benefits. What we have actually achieved, however, is a situation in which personal data is put at risk and good actors are easily tracked by bad actors. It is privacy for skilled bad actors, so to speak, and no privacy for the rest of us.

After thirty years of time and cost, why have we still not succeeded in stopping this?

Because we have not addressed the fundamental issues of identity and accountability.

What we need is to re-establish the accountability of every participant in the ecosystem.

Such ecosystems can interconnect on the basis of comparability and mutual recognition, ultimately forming a network of accountable ecosystems that covers most of the population of the cyber world.

One such ecosystem is what we are proposing today: GAIN, the Global Assured Identity Network.

GAIN is an overlay network on the internet made up solely of accountable participants. Every participant has undergone identity verification to satisfy regulation when opening an account with a hosting organization, primarily banks and other regulated organizations. These high-assurance attributes are used as the basis for identity information passed from identity information providers to relying parties (RPs) at the end user's direction.

End users become able to prove attributes such as their age as attested by a trusted entity rather than merely self-asserted. This is very powerful.

For example, you can prove that "I am over 18" using an attestation from your bank instead of just saying "trust me."

This is also very good from a privacy point of view.

With self-assertion, the recipient of the information cannot believe what you say, so you usually have to present an identity document such as a driver's license to prove it. At that point you have disclosed far more attributes than just your age.

Such a capability depends on identity verification having been performed beforehand, and identity verification is a very costly process.

Doing it solely for this purpose could be prohibitively expensive.

Banks and other regulated institutions, which have to perform identity verification for their core business anyway, have an advantage here. It is rather like the giant online retailer that began offering cloud computing services to make better use of spare capacity it only needed during the holiday season; other hosting vendors could not compete.

The fact that an identity has been proofed by an identity information provider does not mean individuals cannot act anonymously on GAIN. Individuals can act anonymously or pseudonymously towards merchants and other participants on the network. In practice, disclosure of an individual's attributes is normally kept to a minimum. If they commit fraud, however, the individual can be traced and held accountable under due process. An entity called a designated opener can open the history of a transaction and point to the person behind it.

Every business entity on the network is vetted as part of creating and maintaining a business account with a hosting organization and cannot remain anonymous. They are accountable. As a result, consumers can transact with peace of mind, and for relying parties this translates into more business. All good actors benefit.

The effect is global. A merchant registers and signs a contract once and gains access to network participants all over the world.

Individuals can take part with the reassurance that every relying party in the ecosystem really exists and is accountable. Bad actors will certainly still exist, but their scope will be much more limited. The ecosystem will be far safer than today's internet - good enough for participants to act on. Trust is established once again.

No process is perfect. Bad actors will certainly appear in GAIN too, but they will be the exception rather than the norm, which makes the problem far more manageable. It will be "good enough" to act on.

In other words, trust is re-established.

Hellō

Note that, independently of this, Hellō was announced by Dick Hardt.

GAIN was mentioned several times during the Hellō announcement as well, and I expect the two will end up interconnecting.

They look quite complementary to me!

— Dick Hardt (@DickHardt) September 14, 2021
A scene from EIC 2021 (C) Nat Sakimura

The post We announced GAIN – Global Assured Identity Network first appeared on @_Nat Zone.

John Philpin : Lifestream

Future-citizen skills McKinsey : three things that ‘human

Future-citizen skills

McKinsey : three things that ‘humans’ will need in the future ‘world of work’

Number one

add value beyond what can be done by automated systems and intelligent machines

Shouldn’t we be asking what value machines can add beyond what we humans can be …?


How can there be so much to do that I don’t want to do?

How can there be so much to do that I don’t want to do?


Natural Framing

Natural Framing


Mailchimp is being bought by Intuit. 1) The email service

Mailchimp is being bought by Intuit.

1) The email services are once again being rolled up and merged into their big brothers’ arms.

2) Clearly an indication that Intuit is intending to flex its muscles outside of accounts.

Monday, 13. September 2021

Simon Willison

Datasette Desktop 0.2.0: The annotated release notes

Datasette Desktop is a new macOS desktop application version of Datasette, an "open source multi-tool for exploring and publishing data" built on top of SQLite. I released the first version last week - I've just released version 0.2.0 (and a 0.2.1 bug fix) with a whole bunch of critical improvements.

You can see the release notes for 0.2.0 here, but as I've done with Datasette in the past I've decided to present an annotated version of those release notes providing further background on each of the new features.

The plugin directory

A new plugin directory for installing new plugins and upgrading or uninstalling existing ones. Open it using the "Plugins -> Install and Manage Plugins..." menu item. #74

This was the main focus for the release. Plugins are a key component of both Datasette and Datasette Desktop: my goal is for Datasette to provide a robust core for exploring databases, with a wide array of plugins that support any additional kind of visualization, exploration or data manipulation capability that a user might want.

Datasette Desktop goes as far as bundling an entire standalone Python installation just to ensure that plugins will work correctly, and invisibly sets up a dedicated Python virtual environment for plugins to install into when you first run the application.
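
For a rough idea of what that invisible setup involves (this is a sketch of the general approach, not the app's actual code), creating a dedicated virtual environment and installing into it looks something like this in Python:

import subprocess
import venv
from pathlib import Path

# Dedicated environment for Datasette and its plugins (path as used by the app)
venv_dir = Path.home() / ".datasette-app" / "venv"
venv.create(venv_dir, with_pip=True)

# Install Datasette and the supporting plugin into that environment
subprocess.run(
    [str(venv_dir / "bin" / "pip"), "install", "datasette", "datasette-app-support"],
    check=True,
)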

The first version of the app allowed users to install plugins by pasting their name into a text input field. Version 0.2.0 is a whole lot more sophisticated: the single input field has been replaced by a full plugin directory interface that shows installed vs. available plugins and provides "Install", "Upgrade" and "Uninstall" buttons depending on the state of the plugin.

When I set out to build this I knew I wanted to hit this JSON API on datasette.io to fetch the list of plugins, and I knew I wanted a simple searchable index page. Then I realized I also wanted faceted search, so I could filter for installed vs not-yet-installed plugins.

Datasette's built-in table interface already implements faceted search! So I decided to use that, with some custom templates to add the install buttons and display the plugins in a more suitable format.

The first challenge was getting the latest list of plugins into my Datasette instance. I built this into the datasette-app-support plugin using the startup() plugin hook - every time the server starts up it hits that API and populates an in-memory table with the returned data.
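
The shape of that hook looks roughly like the following; this is a simplified sketch (the real plugin writes the fetched rows into an in-memory database table rather than stashing them on an attribute):

from datasette import hookimpl
import httpx

@hookimpl
def startup(datasette):
    # Returning an async inner function lets the hook do work during server startup
    async def inner():
        plugins = httpx.get(
            "https://datasette.io/content/plugins.json?_shape=array"
        ).json()
        # Illustrative only: keep the directory listing around for later requests
        datasette._plugin_directory = plugins
    return inner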

The data from the API is then extended with four extra columns:

"installed" is set to "installed" or "not installed" depending on whether the plugin has already been installed by the user "Installed_version" is the currently installed version of the plugin "upgrade" is the string "upgrade available" or None - allowing the user to filter for just plugins that can be upgraded "default" is set to 1 if the plugin is a default plugin that came with Datasette

The data needed to build the plugin table is gathered by these three lines of code:

plugins = httpx.get(
    "https://datasette.io/content/plugins.json?_shape=array"
).json()
# Annotate with list of installed plugins
installed_plugins = {
    plugin["name"]: plugin["version"]
    for plugin in (await datasette.client.get("/-/plugins.json")).json()
}
default_plugins = (os.environ.get("DATASETTE_DEFAULT_PLUGINS") or "").split()

The first line fetches the full list of known plugins from the Datasette plugin directory

The second makes an internal API call to the Datasette /-/plugins.json endpoint using the datasette.client mechanism to discover what plugins are currently installed and their versions.

The third line loads a space-separated list of default plugins from the DATASETTE_DEFAULT_PLUGINS environment variable.

That last one deserves further explanation. Datasette Desktop now ships with some default plugins, and the point of truth for what those are lives in the Electron app codebase - because that's where the code responsible for installing them is.

Five plugins are now installed by default: datasette-vega, datasette-cluster-map, datasette-pretty-json, datasette-edit-schema and datasette-configure-fts. #81

The plugin directory needs to know what these defaults are so it can avoid showing the "uninstall" button for those plugins. Uninstalling them currently makes no sense because Datasette Desktop installs any missing dependencies when the app starts, which would instantly undo the user's decision to uninstall.

An environment variable felt like the most straight-forward way to expose that list of default plugins to the underlying Datasette server!

I plan to make default plugins uninstallable in the future but doing so requires a mechanism for persisting user preference state which I haven't built yet (see issue #101).

A log on the loading screen

The application loading screen now shows a log of what is going on. #70

The first time you launch the Datasette Desktop application it creates a virtual environment and installs datasette, datasette-app-support and the five default plugins (plus their dependencies) into that environment.

This can take quite a few seconds, during which the original app would show an indeterminate loading indicator.

Personally I hate loading indicators which don't show the difference between something that's working and something that's eternally hung. Since I can't estimate how long it will take, I decided to pipe the log of what the pip install command is doing to the loading screen itself.

For most users this will be meaningless, but hopefully will help communicate "I'm installing extra stuff that I need". Advanced users may find this useful though, especially for bug reporting if something goes wrong.

Under the hood I implemented this using a Node.js EventEmitter. I use the same trick to forward server log output to the "Debug -> Show Server Log" interface.

Example CSV files

The welcome screen now invites you to try out the application by opening interesting example CSV files, taking advantage of the new "File -> Open CSV from URL..." feature. #91

Previously Datasette Desktop wouldn't do anything at all until you opened up a CSV or SQLite database, and I have a hunch that unlike me most people don't have good examples of those to hand at all times!

The new welcome screen offers example CSV files that can be opened directly from the internet. I implemented this using a new API at datasette.io/content/example_csvs (add .json for the JSON version) which is loaded by code running on that welcome page.

I have two examples at the moment, for the Squirrel Census and the London Fire Brigade's animal rescue data. I'll be adding more in the future.

The API itself is a great example of the Baked Data architectural pattern in action: the data itself is stored in this hand-edited YAML file, which is compiled to SQLite every time the site is deployed.
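
A minimal sketch of that Baked Data step, assuming a YAML file of example CSVs and the sqlite-utils library (the file and table names here are illustrative):

import yaml
import sqlite_utils

# Hand-edited content lives in version control as YAML...
with open("example_csvs.yml") as f:
    examples = yaml.safe_load(f)  # e.g. a list of {"name": ..., "url": ...} records

# ...and is baked into a SQLite file at deploy time for Datasette to serve
db = sqlite_utils.Database("content.db")
db["example_csvs"].insert_all(examples)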

To get this feature working I added a new "Open CSV from URL" capability to the app, which is also available in the File menu. Under the hood this works by passing the provided URL to the new /-/open-csv-from-url API endpoint. The implementation of this was surprisingly fiddly as I wanted to consume the CSV file using an asynchronous HTTP client - I ended up using an adaptation of some example code from the aiofile README.
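
The general idea behind that endpoint looks something like this sketch, which fetches the CSV with an asynchronous HTTP client and loads it into SQLite (illustrative only, not the plugin's actual implementation):

import asyncio
import csv
import io

import httpx
import sqlite_utils

async def open_csv_from_url(url, db_path=":memory:", table="imported"):
    # Fetch the CSV without blocking the event loop
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        response.raise_for_status()
    rows = list(csv.DictReader(io.StringIO(response.text)))
    # Load the parsed rows into a SQLite table
    db = sqlite_utils.Database(db_path)
    db[table].insert_all(rows)
    return len(rows)

# Example: asyncio.run(open_csv_from_url("https://example.com/data.csv"))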

Recently opened files and "Open with Datasette"

Recently opened .db and .csv files can now be accessed from the new "File -> Open Recent" menu. Thanks, Kapilan M! #54

This was the project's first external contribution! Kapilan M figured out a way to hook into the macOS "recent files" mechanism from Electron, and I expanded that to cover SQLite database in addition to CSV files.

When a recent file is selected, Electron fires the "open-file" event. This same event is fired when a file is opened using "Open With -> Datasette" or dragged onto the application's dock.

This meant I needed to tell the difference between a CSV and a SQLite database file, which I do by checking if the first 16 bytes of the file match the SQLite header of SQLite format 3\0.
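
The app performs that check in its Electron (JavaScript) code; in Python the same idea is a one-liner, shown here as an illustration:

def is_sqlite_database(path):
    # SQLite database files always begin with this fixed 16-byte header string
    with open(path, "rb") as f:
        return f.read(16) == b"SQLite format 3\x00"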

.db and .csv files can now be opened in Datasette starting from the Finder using "Right Click -> Open With -> Datasette". #40

Registering Datasette as a file handler for .csv and .db was not at all obvious. It turned out to involve adding the following to the Electron app's package.json file:

"build": { "appId": "io.datasette.app", "mac": { "category": "public.app-category.developer-tools", "extendInfo": { "CFBundleDocumentTypes": [ { "CFBundleTypeExtensions": [ "csv", "tsv", "db" ], "LSHandlerRank": "Alternate" } ] } The Debug Menu

A new Debug menu can be enabled using "Datasette -> About Datasette -> Enable Debug Menu".

The debug menu existed previously in development mode, but with 0.2.0 I decided to expose it to end users. I didn't want to show it to people who weren't ready to see it, so you have to first enable it using a button on the about menu.

The most interesting option there is "Run Server Manually".

Most of the time when you are using the app there's a datasette Python server running under the hood, but it's entirely managed by the Node.js child_process module.

When developing the application (or associated plugins) it can be useful to manually run that server rather than having it managed by the app, so you can see more detailed error messages or even add the --pdb option to drop into a debugger should something go wrong.

To run that server, you need the Electron app to kill its own version... and you then need to know things like what port it was running on and which environment variables it was using.

Here's what you see when you click the "Run Server Manually" debug option:

Here's that command in full:

DATASETTE_API_TOKEN="0ebb45444ba4cbcbacdbcbb989bb0cd3aa10773c0dfce73c0115868d0cee2afa" DATASETTE_SECRET="4a8ac89d0d269c31d99059933040b4511869c12dfa699a1429ea29ee3310a850" DATASETTE_DEFAULT_PLUGINS="datasette datasette-app-support datasette-vega datasette-cluster-map datasette-pretty-json datasette-edit-schema datasette-configure-fts datasette-leaflet" /Users/simon/.datasette-app/venv/bin/datasette --port 8002 --version-note xyz-for-datasette-app --setting sql_time_limit_ms 10000 --setting max_returned_rows 2000 --setting facet_time_limit_ms 3000 --setting max_csv_mb 0

This is a simulation of the command that the app itself used to launch the server. Pasting that into a terminal will produce an exact copy of the original process - and you can add --pdb or other options to further customize it.

Bonus: Restoring the in-memory database on restart

This didn't make it into the formal release notes, but it's a fun bug that I fixed in this release.

Datasette Desktop defaults to opening CSV files in an in-memory database. You can import them into an on-disk database too, but if you just want to start exploring CSV data in Datasette I decided an in-memory database would be a better starting point.

There's one problem with this: installing a plugin requires a Datasette server restart, and restarting the server clears the content of that in-memory database, causing any tables created from imported CSVs to disappear. This is confusing!

You can follow my progress on this in issue #42: If you open a CSV and then install a plugin the CSV table vanishes. I ended up solving it by adding code that dumps the "temporary" in-memory database to a file on disk before a server restart, restarts the server, then copies that disk backup into memory again.

This works using two custom API endpoints added to the datasette-app-support plugin:

POST /-/dump-temporary-to-file with {"path": "/path/to/backup.db"} dumps the contents of that in-memory temporary database to the specified file.

POST /-/restore-temporary-from-file with {"path": "/path/to/backup.db"} restores the content back again.

These APIs are called from the startOrRestart() method any time the server restarts, with a file path generated by Electron using the following:

backupPath = path.join( app.getPath("temp"), `backup-${crypto.randomBytes(8).toString("hex")}.db` );

The file is deleted once it has been restored.

After much experimentation, I ended up using the db.backup(other_connection) method that was added to Python's sqlite3 module in Python 3.7. Since Datasette Desktop bundles its own copy of Python 3.9 I don't have to worry about compatibility with older versions at all.
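Here's a simplified sketch of that dump-and-restore round trip using the sqlite3 backup API - the real plugin works against Datasette's managed connections, so treat the function names as illustrative:

import sqlite3

def dump_temporary_to_file(memory_conn, path):
    # Copy every table from the in-memory database into a file on disk
    disk_conn = sqlite3.connect(path)
    try:
        memory_conn.backup(disk_conn)
    finally:
        disk_conn.close()

def restore_temporary_from_file(memory_conn, path):
    # Copy the on-disk backup back into the in-memory database
    disk_conn = sqlite3.connect(path)
    try:
        disk_conn.backup(memory_conn)
    finally:
        disk_conn.close()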

The rest is in the milestone

If you want even more detailed notes on what went into the release, each new feature is included in the 0.2.0 milestone, accompanied by a detailed issue with screenshots (and even a few videos) plus links to the underlying commits.


John Philpin : Lifestream

Thinking. I should try it more.


Thinking.

I should try it more.


Thankyou @lmika


Thankyou @lmika


Phil Windley's Technometria

Toothbrush Identity


Summary: Identity finds its way into everything—even toothbrushes. Careful planning can overcome privacy concerns to yield real benefits to businesses and customers alike.

I have a Philips Sonicare toothbrush. One of the features is a little yellow light that comes on to tell me that the head needs to be changed. The first time the light came on, I wondered how I would reset it once I got a new toothbrush head. I even googled it to find out.

Turns out I needn't have bothered. Once I changed the head the light went off. This didn't happen when I just removed the old head and put it back on. The toothbrush heads have a unique identity that the toothbrush recognizes. This identity is not only used to signal head replacement, but also to put the toothbrush into different modes based on the type of head installed.

Philips calls this BrushSync, but it's just RFID technology underneath the branding. Each head has an RFID chip embedded in it and the toothbrush body reads the data off the head and adjusts its internal state in the appropriate way.

I like this use case for RFID because it's got clear benefits for both Philips and their customers. Philips sells more toothbrush heads—so the internet of things (IoT) use case is clearly aligned with business goals. Customers get reminders to replace their toothbrush head and can reset the reminder by simply doing what they'd do anyway—switch the head.

There aren't many privacy concerns at present. But as more and more products include RFID chips, you could imagine scanners on garbage trucks that correlate what gets used and thrown out with an address. I guess we need garbage cans that can disable RFID chips when they're thrown away.

I was recently talking to a friend of mine, Eric Olafson, who is a founding investor in Riot. Riot is another example of how thoughtfully applied RFID-based identifiers can solve business and customer problems. Riot creates tech that companies can use for RFID-based, in-store inventory management. This solves a big problem for stores that often don't know what inventory they have on hand. With Riot, a quick scan of the store each morning updates the inventory management system, showing where the inventory data is out of sync with the physical inventory. As more and more of us go to the physical store because the app told us they had the product we wanted, it's nice to know the app isn't lying. Riot puts the RFID on the tag, not the clothing, dealing with many of the privacy concerns.

Both BrushSync and Riot use identity to solve business problems, showing that unique identifiers on individual products can be good for business and customers alike. This speaks to the breadth of identity and its importance in areas beyond associating identifiers with people. I've noticed an uptick in discussions at IIW about identity for things and the impact that can have. The next IIW is Oct 12-14—online—join us if you're interested.

Photo Credit: SoniCare G3 from Philips USA (fair use)

Tags: identity iot rfid


Damien Bod

Implementing Angular Code Flow with PKCE using node-oidc-provider


This post shows how an Angular application can be secured using OpenID Connect code flow with PKCE and the node-oidc-provider identity provider. This requires the correct configuration on both the client and the identity provider.

The node-oidc-provider clients need a configuration for the public client which uses refresh tokens. The grant_types ‘refresh_token’, ‘authorization_code’ are added as well as the offline_access scope.

clients: [
  {
    client_id: 'angularCodeRefreshTokens',
    token_endpoint_auth_method: 'none',
    application_type: 'web',
    grant_types: ['refresh_token', 'authorization_code'],
    response_types: ['code'],
    redirect_uris: ['https://localhost:4207'],
    scope: 'openid offline_access profile email',
    post_logout_redirect_uris: ['https://localhost:4207']
  }
]

The Angular client is implemented using angular-auth-oidc-client. The offline_access scope is requested as well as the prompt=consent. The nonce validation after a refresh is ignored.

import { NgModule } from '@angular/core';
import { AuthModule, LogLevel } from 'angular-auth-oidc-client';

@NgModule({
  imports: [
    AuthModule.forRoot({
      config: {
        authority: 'http://localhost:3000',
        redirectUrl: window.location.origin,
        postLogoutRedirectUri: window.location.origin,
        clientId: 'angularCodeRefreshTokens',
        scope: 'openid profile offline_access',
        responseType: 'code',
        silentRenew: true,
        useRefreshToken: true,
        logLevel: LogLevel.Debug,
        ignoreNonceAfterRefresh: true,
        customParams: {
          prompt: 'consent', // login, consent
        },
      },
    }),
  ],
  exports: [AuthModule],
})
export class AuthConfigModule {}

That’s all the configuration required.

Links:

https://github.com/panva/node-oidc-provider

https://github.com/damienbod/angular-auth-oidc-client


John Philpin : Lifestream

Looks like Boris has thrown in the towel for the UK - even w


Looks like Boris has thrown in the towel for the UK - even with 000s in hospital - at the other end of the Spectrum to ( say ) New Zealand.

I just hope that the names of the criminals that are under investigation in NZ are exposed - and if guilty sent to prison - not fined.

Saturday, 11. September 2021

Here's Tom with the Weather

Mont Sainte Anne


I enjoyed hiking the “Le Sentier des Pionniers” trail at Mont Sainte Anne today. Definitely ordering pizza tonight.

Friday, 10. September 2021

Moxy Tongue

Not Moxie Marlinspike


Oft confused, no more.

https://github.com/lifewithalacrity/lifewithalacrity.github.io/commit/52c30ec1d649494066c3e9c9fa1bbaf95cd6386f 

https://github.com/lifewithalacrity/lifewithalacrity.github.io/commit/d7252be02cb351368c2c1bb00c66ad8d15ef5e21

Self-Sovereign Identity has deep roots. It did not just emerge in 2016 after a blog post was written. It did not fail to exist when wikipedia editors denied it subject integrity with the stated message: "good luck with that".

Self-Sovereign Identity is a structural result of accurate Sovereign source authority, expressive by Individuals. People, using tools, can do many amazing things.

The social telling of information: http://www.lifewithalacrity.com/2016/04/the-path-to-self-soverereign-identity.html constructed to socialize access to methods being advanced by social groups of one form or another, can do nothing to alter the root ownership and expression of "self-sovereign rights". Capturing a social opportunity, or servicing a social service deployed for public benefit, has structural requirements. The "proof is in the pudding" is a digital data result now in the real world. Data literacy is a functional requirement of baseline participation in a "civil society". Root Sovereignty is a design outcome with living data results that must be evaluated for accuracy & integrity. Functional literacy is a requirement for system integrity.

Words carry meaning; literary abstractions of often real and digital objects, methods, processes, ideas. When a word's meaning is not accurate, or not accurate enough, language is negatively affected, and as a result, communication of integrity between people is affected. People, Individuals all, living among one another, is the only actual reality of human existence in the Universe. This structural requirement for accuracy can only translate accurately enough by using the right words. "We the people", if not structurally referring to the Individuals walking in local communities with blood running through their veins, but instead translated via legalese into a legal abstraction giving force to Government administration methods, represents a perversion of human intent.

While amendments have been offered to fix past perversions of human intent and accurate use of language, the root error of omission providing cover for the mis-translation of "We The People" in the United States of America in 2021 is not a construct of words, or literary amendments. Instead, it is a restructuring of baseline participation in the administration of human authority within a Sovereign territory. "We The People", Individuals All, give administrative integrity to the Government derived "of, by, for" our consenting authority. Our Sovereign source authority, represented as Individuals, by our own self-Sovereign identity is the means by which this Nation came into existence, and the only means by which it continues.

America can not be provisioned from a database. People own root authority, personally, or not at all.








Simon Willison

Quoting Matt Levine


Imagine writing the investment memo for “20% of a picture of a dog” and being like “the most we should pay is probably about $2 million because the whole picture of the dog sold for $4 million three months ago and it can’t realistically have appreciated more than 150% since then; even if the whole picture of the dog is worth, aggressively, $10 million, this share would be worth $2 milllion.” What nonsense that is!

Matt Levine


Jon Udell

Query like it’s 2022


Monday will be my first day as community lead for Steampipe, a young open source project that normalizes APIs by way of Postgres foreign data wrappers. The project’s taglines are select * from cloud and query like it’s 1992; the steampipe.io home page nicely illustrates these ideas.

I’ve been thinking about API normalization for a long time. The original proposal for the World Wide Web says:

Databases

A generic tool could perhaps be made to allow any database which uses a commercial DBMS to be displayed as a hypertext view.

We ended up with standard ways for talking to databases — ODBC, JDBC — but not for expressing them on the web.

When I was at Microsoft I was bullish on OData, an outgrowth of Pablo Castro’s wonderful Project Astoria. Part of the promise was that every database-backed website could automatically offer basic API access that wouldn’t require API wrappers for everybody’s favorite programming language. The API was hypertext; a person could navigate it using links and search. Programs wrapped around that API could be useful, but meaningful interaction with data would be possible without them.

(For a great example of what that can feel like, jump into the middle of one of Simon Willison’s datasettes, for example san-francisco.datasettes.com, and start clicking around.)

Back then I wrote a couple of posts on this topic[1, 2]. Many years later OData still hasn’t taken the world by storm. I still think it’s a great idea and would love to see it, or something like it, catch on more broadly. Meanwhile Steampipe takes a different approach. Given a proliferation of APIs and programming aids for them, let’s help by providing a unifying abstraction: SQL.

I’ve done a deep dive into the SQL world over the past few years. The first post in a series I’ve been writing on my adventures with Postgres is what connected me to Steampipe and its sponsor (my new employer) Turbot. When you install Steampipe it brings Postgres along for the ride. Imagine what you could do with data flowing into Postgres from many different APIs and filling up tables you can view, query, join, and expose to tools and systems that talk to Postgres. Well, it’s going to be my job to help imagine, and explain, what’s possible in that scenario.

Meanwhile I need to give some thought to my Twitter tag line: patron saint of trailing edge technologies. It’s funny and it’s true. At BYTE I explored how software based on the Net News Transfer Protocol enabled my team to do things that we use Slack for today. At Microsoft I built a system for community-scale calendaring based on iCalendar. When I picked up NNTP and iCalendar they were already on the trailing edge. Yet they were, and especially in the case of iCalendar still are, capable of doing much more than is commonly understood.

Then of course came web annotation. Although Hypothesis recently shepherded it to W3C standardization it goes all the way back to the Mosaic browser and is exactly the kind of generative tech that fires my imagination. With Hypothesis now well established in education, I hope others will continue to explore the breadth of what’s possible when every document workflow that needs to can readily connect people, activities, and data to selections in documents. If that’s of interest, here are some signposts pointing to scenarios I’ve envisioned and prototyped.

And now it’s SQL. For a long time I set it aside in favor of object, XML, and NoSQL stores. Coming back to it, by way of Postgres, has shown me that:

– Modern SQL is more valuable as a programming language than is commonly understood

– So is Postgres as a programming environment

The tagline query like it’s 1992 seems very on-brand for me. But maybe I should let go of the trailing-edge moniker. Nostalgia isn’t the best way to motivate fresh energy. Maybe query like it’s 2022 sets a better tone? In any case I’m very much looking forward to this next phase.

Thursday, 09. September 2021

Doc Searls Weblog

The Matrix 4.0


The original Matrix is my favorite movie. Not because I think it’s the best. I just think it’s the most important. Also among the most rewatchable. (Hear that, Ringer? Rewatch the whole series before Christmas.)

And now the fourth movie in the series is coming out: The Matrix Resurrections. Here’s the @TheMatrixMovie‘s new pinned tweet of the first trailer.

Yeah, it’s a sequel, and sequels tend to sag. Even The Godfather Part 2. (But that one only sagged in the relative sense, since the original was perfect.)

If anything bothers me about this next Matrix it’s that what seemed an untouchable Classic is now a Franchise. Not a bad beast, the Franchise. Just different: same genus, different species.

Given the way these things go, my expectations are low and my hopes high.

Meanwhile, I’m wondering why Laurence Fishburne, Hugo Weaving, and Lilly Wachowski don’t return in Resurrections. Not being critical here. Just curious.

Bonus link: a must-see from 2014.

Also, from my old blog in 2003,

William Blaze has an interesting take on the political agenda of The Matrix Franchise.

My own thoughts about the original Matrix (that it was a metaphor for marketing, basically) are here, here and here.†

That was back when blogging was blogging. Which it will be again, at least for some of us, when Dave Winer is finished rebooting the practice with Drummer.

† I know those two links are duplicates, but don’t have the time to hunt down the originals. And Google is no help, because it ignores lots of old material.


Is there a way out of password hell?


Passwords are hell.

Worse, to make your hundreds of passwords as safe as possible, they should be nearly impossible for others to discover—and for you to remember.

Unless you’re a wizard, this all but requires using a password manager.†

Think about how hard that job is. First, it’s impossible for developers of password managers to do everything right:

Most of their customers and users need to have logins and passwords for hundreds of sites and services on the Web and elsewhere in the networked world

Every one of those sites and services has its own gauntlet of methods for registering logins and passwords, and for remembering and changing them

Every one of those sites and services has its own unique user interfaces, each with its own peculiarities

All of those UIs change, sometimes often.

Keeping up with that mess while also keeping personal data safe from both user error and determined bad actors is about as tall as an order can get. And then you have to do all that work for each of the millions of customers you’ll need if you’re going to make the kind of money required to keep abreast of those problems and provide the solutions required.

So here’s the thing: the best we can do with passwords is the best that password managers can do. That’s your horizon right there.

Unless we can get past logins and passwords somehow.

And I don’t think we can. Not in the client-server ecosystem that the Web has become, and that industry never stopped being, since long before the Internet came along. That’s the real hell. Passwords are just a symptom.

We need to work around it. That’s my work now. Stay tuned here, here, and here for more on that.

† We need to fix that Wikipedia page.

Wednesday, 08. September 2021

Simon Willison

Datasette Desktop - a macOS desktop application for Datasette


I just released version 0.1.0 of the new Datasette macOS desktop application, the first version that end-users can easily install. I would very much appreciate your help testing it out!

Datasette Desktop

Datasette is "an open source multi-tool for exploring and publishing data". It's a Python web application that lets you explore data held in SQLite databases, plus a growing ecosystem of plugins for visualizing and manipulating those databases.

Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to explore and share with the world.

There's just one big catch: since it's a Python web application, those users have needed to figure out how to install and run Python software in order to use it. For people who don't live and breathe Python and the command-line this turns out to be a substantial barrier to entry!

Datasette Desktop is my latest attempt at addressing this problem. I've packaged up Datasette, SQLite and a full copy of Python such that users can download and uncompress a zip file, drag it into their /Applications folder and start using Datasette, without needing to know that there's a Python web server running under the hood (or even understand what a Python web server is).

Please try it out, and send me feedback and suggestions on GitHub.

What the app does

This initial release has a small but useful set of features:

Open an existing SQLite database file and offer all of Datasette's functionality, including the ability to explore tables and to execute arbitrary SQL queries.

Open a CSV file and offer the Datasette table interface (example here). By default this uses an in-memory database that gets cleared when the app shuts down, or you can...

Import CSV files into tables in on-disk SQLite databases (including creating a new blank database first).

By default the application runs a local web server which only accepts connections from your machine... but you can change that in the "File -> Access Control" menu to allow connections from anyone on your network. This includes Tailscale networks too, allowing you to run the application on your home computer and then access it securely from other devices such as your mobile phone anywhere in the world.

You can install plugins! This is the most exciting aspect of this initial release: it's already in a state where users can customize it and developers can extend it, either with Datasette's existing plugins (69 and counting) or by writing new ones.

How the app works

There are three components to the app:

A macOS wrapper application

Datasette itself

The datasette-app-support plugin

The first is the macOS application itself. This is currently written with Electron, and bundles a full copy of Python 3.9 (based on python-build-standalone by Gregory Szorc). Bundling Python is essential: the principal goal of the app is to allow people to use Datasette who aren't ready to figure out how to install their own Python environment. Having an isolated and self-contained Python is also a great way of avoiding making XKCD 1987 even worse.

The macOS application doesn't actually include Datasette itself. Instead, on first launch it creates a new Python virtual environment (currently in ~/.datasette-app/venv, feedback on that location welcome) and installs the other two components: Datasette and the datasette-app-support plugin.

Having a dedicated virtual environment is what enables the "Install Plugin" menu option. When a plugin is installed the macOS application runs pip install name-of-plugin and then restarts the Datasette server process, causing it to load that new plugin.
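The Electron app shells out for this via child_process; as an illustration of the equivalent flow, here's a Python sketch using subprocess (the venv path and function name are illustrative, not the app's actual code):

import pathlib
import subprocess

VENV = pathlib.Path.home() / ".datasette-app" / "venv"

def install_plugin(name):
    # Run pip from the dedicated virtual environment so the plugin is
    # available the next time the Datasette server process starts
    subprocess.run([str(VENV / "bin" / "pip"), "install", name], check=True)

# install_plugin("datasette-cluster-map")  # then restart the Datasette server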

The datasette-app-support plugin is designed exclusively to work with this application. It adds API endpoints that the Electron shell can use to trigger specific actions, such as "import from this CSV file" or "attach this SQLite database" - these are generally triggered by macOS application menu items.

It also adds a custom authentication mechanism. The user of the app should have special permissions: only they should be able to import a CSV file from anywhere on their computer into Datasette. But for the "network share" feature I want other users to be able to access the web application.

An interesting consequence of installing Datasette on first-run rather than bundling it with the application is that the user will be able to upgrade to future Datasette releases without needing to re-install the application itself.

How I built it

I've been building this application completely in public over the past two weeks, writing up my notes and research in GitHub issues as I went (here's the initial release milestone).

I had to figure out a lot of stuff!

First, Electron. Since almost all of the user-facing interface is provided by the existing Datasette web application, Electron was a natural fit: I needed help powering native menus and bundling everything up as an installable application, which Electron handles extremely well.

I also have ambitions to get a Windows version working in the future, which should share almost all of the same code.

Electron also has fantastic initial developer onboarding. I'd love to achieve a similar level of quality for Datasette some day.

The single biggest challenge was figuring out how to bundle a working copy of the Datasette Python application to run inside the Electron application.

My initial plan (touched on last week) was to compile Datasette and its dependencies into a single executable using PyInstaller or PyOxidizer or py2app.

These tools strip down a Python application to the minimal required set of dependencies and then use various tricks to compress that all into a single binary. They are really clever. For many projects I imagine this would be the right way to go.

I had one big problem though: I wanted to support plugin installation. Datasette plugins can have their own dependencies, and could potentially use any of the code from the Python standard library. This means that a stripped-down Python isn't actually right for this project: I need a full installation, standard library and all.

Telling the user they had to install Python themselves was an absolute non-starter: the entire point of this project is to make Datasette available to users who are unwilling or unable to jump through those hoops.

Gregory Szorc built PyOxidizer, and as part of that he built python-build-standalone:

This project produces self-contained, highly-portable Python distributions. These Python distributions contain a fully-usable, full-featured Python installation as well as their build artifacts (object files, libraries, etc).

Sounds like exactly what I needed! I opened a research issue, built a proof-of-concept and decided to commit to that as the approach I was going to use. Here's a TIL that describes how I'm doing this: Bundling Python inside an Electron app

(I find GitHub issue threads to be the ideal way of exploring these kinds of areas. Many of my repositories have a research label specifically to track them.)

The last key step was figuring out how to sign the application, so I could distribute it to other macOS users without them facing this dreaded dialog:

It turns out there are two steps to this these days: signing the code with a developer certificate, and then "notarizing" it, which involves uploading the bundle to Apple's servers, having them scan it for malicious code and attaching the resulting approval to the bundle.

I was expecting figuring this out to be a nightmare. It ended up not too bad: I spent two days on it, but most of the work ended up being done by electron-builder - one of the biggest advantages of working within the Electron ecosystem is that a lot of people have put a lot of effort into these final steps.

I was adamant that my eventual signing and notarization solution should be automated using GitHub Actions: nothing defangs a frustrating build process more than good automation! This made things a bit harder because all of the tutorials and documentation assumed you were working with a GUI, but I got there in the end. I wrote this all up as a TIL: Signing and notarizing an Electron app for distribution using GitHub Actions (see also Attaching a generated file to a GitHub release using Actions).

What's next

I announced the release last night on Twitter and I've already started getting feedback. This has resulted in a growing number of issues under the usability label.

My expectation is that most improvements made for the benefit of Datasette Desktop will benefit the regular Datasette web application too.

There's also a strategic component to this. I'm investing a lot of development work in Datasette, and I want that work to have the biggest impact possible. Datasette Desktop is an important new distribution channel, which also means that any time I add a new feature to Datasette or build a new plugin the desktop application should see the same benefit as the hosted web application.

If I'm unlucky I'll find this slows me down: every feature I build will need to include consideration as to how it affects the desktop application.

My intuition currently is that this trade-off will be worthwhile: I don't think ensuring desktop compatibility will be a significant burden, and the added value from getting new features almost for free through a whole separate distribution channel should hopefully be huge!

TIL this week

Calculating the AQI based on the Purple Air API for a sensor

Using the Chrome DevTools console as a REPL for an Electron app

Open external links in an Electron app using the system browser

Attaching a generated file to a GitHub release using Actions

Signing and notarizing an Electron app for distribution using GitHub Actions

Bundling Python inside an Electron app

Releases this week

datasette-import-table: 0.3 - (6 releases total) - 2021-09-08
Datasette plugin for importing tables from other Datasette instances datasette-app: Datasette Desktop 0.1.0 - 2021-09-08
Electron app wrapping Datasette datasette-app-support: 0.6 - (8 releases total) - 2021-09-07
Part of https://github.com/simonw/datasette-app pids: 0.1.2 - 2021-09-07
A tiny Python library for generating public IDs from integers

Phil Windley's Technometria

Fluid Multi-Pseudonymity


Summary: Fluid multi-pseudonymity perfectly describes the way we live our lives and the reality that identity systems must realize if we are to live authentically in the digital sphere.

In response to my recent post on Ephemeral Relationships, Emil Sotirov tweeted that this was an example of "fluid multi-pseudonymity as the norm." I love that phrase because it succinctly describes something I've been trying to explain for years.

Emil was riffing on this article in Aeon, You are a network, which says "Selves are not only 'networked', that is, in social networks, but are themselves networks." I've never been a fan of philosophical introspections in digital identity discussions. I just don't think they often lead to useful insights. Rather, I like what Joe Andrieu calls functional identity: Identity is how we recognize, remember, and ultimately respond to specific people and things. But this insight, that we are multiple selves, changing over time—even in the course of a day—is powerful. And as Emil points out, our real-life ephemeral relationships are an example of this fluid multi-pseudonymity.

The architectures of traditional, administrative identity systems do not reflect the fluid multi-pseudonymity of real life and consequently are mismatched to how people actually live. I frequently see calls for someone, usually a government, to solve the online identity problem by issuing everyone a permanent "identity." I put that in quotes because I hate when we use the word "identity" in that way—as if everyone has just one and once we link every body (literally) to some government issued identifier and a small number of attributes all our problems will disappear.

These calls don't often come from within the identity community. Identity professionals understand how hard this problem is and that there's no single identity for anyone. But even identity professionals use the word "identity" when they mean "account." I frequently make an ass of myself by pointing that out. I get invited to fewer meetings that way. The point is this: there is no "identity." And we don't build identity systems to manage identities (whatever those are), but, rather, relationships.

All of us, in real life and online, have multiple relationships. Many of those are pseudonymous. Many are ephemeral. But even a relationship that starts pseudonymous and ephemeral can develop into something permanent and better defined over time. Any relationship we have, even those with online services, changes over time. In short, our relationships are fluid and each is different.

Self-sovereign identity excites me because, for the first time, we have a model for online identity that can flexibly support fluid multi-pseudonymity. Decentralized identifiers and verifiable credentials form an identity metasystem capable of being the foundation for any kind of relationship: ephemeral, pseudonymous, ad hoc, permanent, personal, commercial, legal, or anything else. For details on how this all works, see my Frontiers article on the identity metasystem.

An identity metasystem that matches the fluid multi-pseudonymity inherent in how people actually live is vital for personal autonomy and ultimately human rights. Computers are coming to intermediate every aspect of our lives. Our autonomy and freedom as humans depend on how we architect this digital world. Unless we put digital systems under the control of the individuals they serve without intervening administrative authorities and make them as flexible as our real lives demand, the internet will undermine the quality of life it is meant to bolster. The identity metasystem is the foundation for doing that.

Photo Credit: Epupa Falls from Travel Trip Journey (none)

Tags: identity relationships ssi pseudonymity


Simon Willison

Datasette Desktop 0.1.0


Datasette Desktop 0.1.0

This is the first installable version of the new Datasette Desktop macOS application I've been building. Please try it out and leave feedback on Twitter or on the GitHub Discussions thread linked from the release notes.

Via @simonw

Tuesday, 07. September 2021

Jon Udell

The Postgres REPL


R0ml Lefkowitz’s The Image of Postgres evokes the Smalltalk experience: reach deeply into a running system, make small changes, see immediate results. There isn’t yet a fullblown IDE for the style of Postgres-based development I describe in this series, though I can envision a VSCode extension that would provide one. But there is certainly a REPL (read-eval-print loop), it’s called psql, and it delivers the kind of immediacy that all REPLs do. In our case there’s also Metabase; it offers a complementary REPL that enhances its power as a lightweight app server.

In the Clojure docs it says:

The Clojure REPL gives the programmer an interactive development experience. When developing new functionality, it enables her to build programs first by performing small tasks manually, as if she were the computer, then gradually make them more and more automated, until the desired functionality is fully programmed. When debugging, the REPL makes the execution of her programs feel tangible: it enables the programmer to rapidly reproduce the problem, observe its symptoms closely, then improvise experiments to rapidly narrow down the cause of the bug and iterate towards a fix.

I feel the same way about the Python REPL, the browser’s REPL, the Metabase REPL, and now also the Postgres REPL. Every function and every materialized view in the analytics system begins as a snippet of code pasted into the psql console (or Metabase). Iteration yields successive results instantly, and those results reflect live data. In How is a Programmer Like a Pathologist Gilad Bracha wrote:

A live program is dynamic; it changes over time; it is animated. A program is alive when it’s running. When you work on a program in a text editor, it is dead.

Tudor Girba amplified the point in a tweet.

In a database-backed system there’s no more direct way to interact with live data than to do so in the database. The Postgres REPL is, of course, a very sharp tool. Here are some ways to handle it carefully.

Find the right balance for tracking incremental change

In Working in a hybrid Metabase / Postgres code base I described how version-controlled files — for Postgres functions and views, and for Metabase questions — repose in GitHub and drive a concordance of docs. I sometimes write code snippets directly in psql or Metabase, but mainly compose in a “repository” (telling word!) where those snippets are “dead” artifacts in a text editor. They come to life when pasted into psql.

A knock on Smalltalk was that it didn’t play nicely with version control. If you focus on the REPL aspect, you could say the same of Python or JavaScript. In any such case there’s a balance to be struck between iterating at the speed of thought and tracking incremental change. Working solo I’ve been inclined toward a fairly granular commit history. In a team context I’d want to leave a chunkier history but still record the ongoing narrative somewhere.

Make it easy to understand the scope and effects of changes

The doc concordance has been the main way I visualize interdependent Postgres functions, Postgres views, and Metabase questions. In Working with interdependent Postgres functions and materialized views I mentioned Laurenz Albe’s Tracking View Dependencies in Postgres. I’ve adapted the view dependency tracker he develops there, and adapted related work from others to track function dependencies.

This tooling is still a work in progress, though. The concordance doesn’t yet include Postgres types, for example, nor the tables that are upstream from materialized views. My hypothetical VSCode extension would know about all the artifacts and react immediately when things change.

Make it easy to find and discard unwanted artifacts

Given a function or view named foo, I’ll often write and test a foo2 before transplanting changes back into foo. Because foo may often depend on bar and call baz I wind up also with bar2 and baz2. These artifacts hang around in Postgres until you delete them, which I try to do as I go along.

If foo2 is a memoized function (see this episode), it can be necessary to delete the set of views that it’s going to recreate. I find these with a query.

select 'drop materialized view ' || matviewname || ';' as drop_stmt from pg_matviews where matviewname ~* {{ pattern }}

That pattern might be question_and_answer_summary_for_group to find all views based on that function, or _6djxg2yk to find all views for a group, or even [^_]{8,8}$ to find all views made by memoized functions.
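Outside Metabase, the same query can be run from any Postgres client. Here's a sketch using psycopg2, with the pattern supplied as an ordinary bind parameter rather than a Metabase template variable (the connection string is a placeholder):

import psycopg2

def drop_statements(pattern):
    # Generate DROP statements for materialized views whose names match a regex
    conn = psycopg2.connect("dbname=analytics")
    try:
        with conn.cursor() as cur:
            cur.execute(
                "select 'drop materialized view ' || matviewname || ';' "
                "from pg_matviews where matviewname ~* %s",
                (pattern,),
            )
            return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()

# drop_statements("question_and_answer_summary_for_group")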

I haven’t yet automated the discovery or removal of stale artifacts and references to them. That’s another nice-to-have for the hypothetical IDE.

The Image of Postgres

I’ll give R0ml the last word on this topic.

This is the BYTE magazine cover from August of 1981. In the 70s and the 80s, programming languages had this sort of unique perspective that’s completely lost to history. The way it worked: a programming environment was a virtual machine image, it was a complete copy of your entire virtual machine memory and that was called the image. And then you loaded that up and it had all your functions and your data in it, and then you ran that for a while until you were sort of done and then you saved it out. And this wasn’t just Smalltalk, Lisp worked that way, APL worked that way, it was kind of like Docker only it wasn’t a separate thing because everything worked that way and so you didn’t worry very much about persistence because it was implied. If you had a programming environment it saved everything that you were doing in the programming environment, you didn’t have to separate that part out. A programming environment was a place where you kept all your data and business logic forever.

So then Postgres is kind of like Smalltalk only different.

What’s the difference? Well we took the UI out of Smalltalk and put it in the browser. The rest of it is the same, so really Postgres is an application delivery platform, just like we had back in the 80s.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/


Moxy Tongue

Bitcoin: Founded By Sovereign Source Authority


No Permission Required. 

Zero-Trust Infrastructure.

A ghost has entered the machine of Society, there is no turning back.





Simon Willison

Quoting Solomon Hykes


We never shipped a great commercial product. The reason for that is we didn’t focus. We tried to do a little bit of everything. It’s hard enough to maintain the growth of your developer community and build one great commercial product, let alone three or four, and it is impossible to do both, but that’s what we tried to do and we spent an enormous amount of money doing it.

Solomon Hykes


MyDigitalFootprint

Plotting ROI, and other measures for gauging performance on Peak Paradox


The purpose of this post is to plot where some (there are way too many to do them all) different investment measures align on the Peak Paradox model. It is not to explain in detail what all the measures mean and their corresponding strengths and weaknesses. This is a good article if you want the latter for the pure financial ones.


Key

ROI: Return on Investment
IRR: Internal Rate of Return
RI: Residual Income
ROE: Return on Equity
ROA: Return on Assets
ROCE: Return on Capital Employed
ROT: Return on Time
IR: Impact Return
SV: Social Value
AR: Asset Returns
PER: Portfolio Expected Return
SROI: Social Return on Investment

The observation is that we have not developed, with any level of sophistication, the same ability to measure or report on anything outside of finance, the measures we call "hard." By calling other important aspects of a decision "soft" we have framed them as less important and harder to agree on. 


Monday, 06. September 2021

Simon Willison

Making world-class docs takes effort


Making world-class docs takes effort

Curl maintainer Daniel Stenberg writes about his principles for good documentation. I agree with all of these: he emphasizes keeping docs in the repo, avoiding the temptation to exclusively generate them from code, featuring examples and ensuring every API you provide has documentation. Daniel describes an approach similar to the documentation unit tests I've been using for my own projects: he has scripts which scan the curl documentation to ensure not only that everything is documented but that each documentation area contains the same sections in the same order.

Via Hacker News


Hyperonomy Digital Identity Lab

Verifiable Credentials Guide for Developer: Call for Participation


Want to contribute to the World Wide Web Consortium (W3C) Developers Guide for Verifiable Credentials?

W3C is an international community that develops open standards to ensure the long-term growth of the Web.

A new W3C Community Note Work Item Proposal entitled Verifiable Credentials Guide for Developers has been submitted and you can help create it.

I want to invite everyone interested in #DigitalIdentity, #DecentralizedIdentity, #VerifiableCredentials, #TrustOnTheInternet, and/or #SecureInternetStorage to join this key group of people who will be defining and creating the W3C Verifiable Credentials Guide for Developers.

Please contact me directly or post an email to public-credentials@w3.org

Links

Draft W3C Community Note: https://t.co/veg349grR9

Work Item: Verifiable Credentials Guide for Developers (VC-GUIDE-DEVELOPERS): https://t.co/LziMaeYskG

GitHub: https://t.co/ptqaUA6IyC

Damien Bod

Using Azure security groups in ASP.NET Core with an Azure B2C Identity Provider


This article shows how to implement authorization in an ASP.NET Core application which uses Azure security groups for the user definitions and Azure B2C to authenticate. Microsoft Graph API is used to access the Azure group definitions for the signed in user. The client credentials flow is used to authorize the Graph API client with an application scope definition. This is not optimal; the delegated user flows would be better. By allowing the application rights for the defined scopes using Graph API, you are implicitly making the application an administrator of the tenant for those scopes.

Code: https://github.com/damienbod/azureb2c-fed-azuread

Two Azure AD security groups were created to demonstrate this feature with Azure B2C authentication. The users were added to the admin group and the user group as required. The ASP.NET Core application uses an ASP.NET Core Razor page which should only be used by admin users, i.e. people in the admin group. To validate this in the application, Microsoft Graph API is used to get the groups for the signed in user, and an ASP.NET Core handler, requirement and policy use the group claim created from the Azure group to force the authorization.

The groups are defined in the same tenant as the Azure B2C.

A separate Azure App registration is used to define the application Graph API scopes. The User.Read.All application scope is used. In the demo, a client secret is used, but a certificate can also be used to access the API.

The Microsoft.Graph Nuget package is used as a client for Graph API.

<PackageReference Include="Microsoft.Graph" Version="4.4.0" />

The GraphApiClientService class implements the Microsoft Graph API client. A ClientSecretCredential instance is used as the AuthProvider and the definitions for the client are read from the application configuration and the user secrets in development, or Azure Key Vault. The user-id from the name identifier claim is used to get the Azure groups for the signed-in user. The claim namespaces get added by the Microsoft client; this can be deactivated if required. I usually use the default claim names but as this is an Azure IDP, I left the Microsoft defaults which adds the extra stuff to the claims. The Graph API GetMemberGroups method returns the group IDs for the signed in identity.

using Azure.Identity;
using Microsoft.Extensions.Configuration;
using Microsoft.Graph;
using System.Threading.Tasks;

namespace AzureB2CUI.Services
{
    public class GraphApiClientService
    {
        private readonly GraphServiceClient _graphServiceClient;

        public GraphApiClientService(IConfiguration configuration)
        {
            string[] scopes = configuration.GetValue<string>("GraphApi:Scopes")?.Split(' ');
            var tenantId = configuration.GetValue<string>("GraphApi:TenantId");

            // Values from app registration
            var clientId = configuration.GetValue<string>("GraphApi:ClientId");
            var clientSecret = configuration.GetValue<string>("GraphApi:ClientSecret");

            var options = new TokenCredentialOptions
            {
                AuthorityHost = AzureAuthorityHosts.AzurePublicCloud
            };

            // https://docs.microsoft.com/dotnet/api/azure.identity.clientsecretcredential
            var clientSecretCredential = new ClientSecretCredential(
                tenantId, clientId, clientSecret, options);

            _graphServiceClient = new GraphServiceClient(clientSecretCredential, scopes);
        }

        public async Task<IDirectoryObjectGetMemberGroupsCollectionPage> GetGraphApiUserMemberGroups(string userId)
        {
            var securityEnabledOnly = true;

            return await _graphServiceClient.Users[userId]
                .GetMemberGroups(securityEnabledOnly)
                .Request().PostAsync()
                .ConfigureAwait(false);
        }
    }
}

The .default scope is used to access the Graph API using the client credential client.

"GraphApi": { "TenantId": "f611d805-cf72-446f-9a7f-68f2746e4724", "ClientId": "1d171c13-236d-4c2b-ac10-0325be2cbc74", "Scopes": ".default" //"ClientSecret": "--in-user-settings--" },

The user and the application are authenticated using Azure B2C and an Azure App registration. Using Azure B2C, only a certain set of claims can be returned which cannot be adapted easily. Once signed-in, we want to include the Azure security group claims in the claims principal. To do this, the Graph API is used to find the claims for the user and add the claims to the claims principal using the IClaimsTransformation implementation. This is where the GraphApiClientService is used.

using AzureB2CUI.Services;
using Microsoft.AspNetCore.Authentication;
using System.Linq;
using System.Security.Claims;
using System.Threading.Tasks;

namespace AzureB2CUI
{
    public class GraphApiClaimsTransformation : IClaimsTransformation
    {
        private GraphApiClientService _graphApiClientService;

        public GraphApiClaimsTransformation(GraphApiClientService graphApiClientService)
        {
            _graphApiClientService = graphApiClientService;
        }

        public async Task<ClaimsPrincipal> TransformAsync(ClaimsPrincipal principal)
        {
            ClaimsIdentity claimsIdentity = new ClaimsIdentity();
            var groupClaimType = "group";
            if (!principal.HasClaim(claim => claim.Type == groupClaimType))
            {
                var nameidentifierClaimType = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier";
                var nameidentifier = principal.Claims.FirstOrDefault(t => t.Type == nameidentifierClaimType);

                var groupIds = await _graphApiClientService.GetGraphApiUserMemberGroups(nameidentifier.Value);

                foreach (var groupId in groupIds.ToList())
                {
                    claimsIdentity.AddClaim(new Claim(groupClaimType, groupId));
                }
            }

            principal.AddIdentity(claimsIdentity);
            return principal;
        }
    }
}

The startup class adds the services and the authorization definitions for the ASP.NET Core Razor page application. The IsAdminHandlerUsingAzureGroups authorization handler is added and this is used to validate the Azure security group claim.

public void ConfigureServices(IServiceCollection services)
{
    services.AddTransient<AdminApiService>();
    services.AddTransient<UserApiService>();
    services.AddScoped<GraphApiClientService>();
    services.AddTransient<IClaimsTransformation, GraphApiClaimsTransformation>();

    services.AddHttpClient();
    services.AddOptions();

    string[] initialScopes = Configuration.GetValue<string>(
        "UserApiOne:ScopeForAccessToken")?.Split(' ');

    services.AddMicrosoftIdentityWebAppAuthentication(Configuration, "AzureAdB2C")
        .EnableTokenAcquisitionToCallDownstreamApi(initialScopes)
        .AddInMemoryTokenCaches();

    services.AddRazorPages().AddMvcOptions(options =>
    {
        var policy = new AuthorizationPolicyBuilder()
            .RequireAuthenticatedUser()
            .Build();
        options.Filters.Add(new AuthorizeFilter(policy));
    }).AddMicrosoftIdentityUI();

    services.AddSingleton<IAuthorizationHandler, IsAdminHandlerUsingAzureGroups>();

    services.AddAuthorization(options =>
    {
        options.AddPolicy("IsAdminPolicy", policy =>
        {
            policy.Requirements.Add(new IsAdminRequirement());
        });
    });
}

The IsAdminHandlerUsingAzureGroups class implements the AuthorizationHandler<IsAdminRequirement> base class for the IsAdminRequirement requirement. The handler checks whether the user has a group claim matching the administrator group ID defined in the Azure tenant.

using Microsoft.AspNetCore.Authorization;
using Microsoft.Extensions.Configuration;
using System;
using System.Linq;
using System.Threading.Tasks;

namespace AzureB2CUI.Authz
{
    public class IsAdminHandlerUsingAzureGroups : AuthorizationHandler<IsAdminRequirement>
    {
        private readonly string _adminGroupId;

        public IsAdminHandlerUsingAzureGroups(IConfiguration configuration)
        {
            _adminGroupId = configuration.GetValue<string>("AzureGroups:AdminGroupId");
        }

        protected override Task HandleRequirementAsync(AuthorizationHandlerContext context, IsAdminRequirement requirement)
        {
            if (context == null)
                throw new ArgumentNullException(nameof(context));
            if (requirement == null)
                throw new ArgumentNullException(nameof(requirement));

            var claimIdentityprovider = context.User.Claims.FirstOrDefault(t =>
                t.Type == "group" && t.Value == _adminGroupId);

            if (claimIdentityprovider != null)
            {
                context.Succeed(requirement);
            }

            return Task.CompletedTask;
        }
    }
}

The policy for this can be used anywhere in the application.

[Authorize(Policy = "IsAdminPolicy")]
[AuthorizeForScopes(Scopes = new string[] {
    "https://b2cdamienbod.onmicrosoft.com/5f4e8bb1-3f4e-4fc6-b03c-12169e192cd7/access_as_user" })]
public class CallAdminApiModel : PageModel
{

If a user tries to call the Razor page which was created for admin users, then an Access denied is returned. Of course, in a real application, the menu for this would also be hidden if the user is not an admin and does not fulfil the policy.

If the user is an admin and a member of the Azure security group, the data and the Razor page can be opened and viewed.

By using Azure security groups, it is really easy for IT admins to add or remove users from the admin role, and this can be managed with PowerShell scripts. It is a pity that the Microsoft Graph API is required to use Azure security groups when authenticating with Azure B2C; this is much simpler when authenticating with Azure AD.

Links

Managing Azure B2C users with Microsoft Graph API

https://docs.microsoft.com/en-us/aspnet/core/blazor/security/webassembly/graph-api

https://docs.microsoft.com/en-us/graph/sdks/choose-authentication-providers?tabs=CS#client-credentials-provider

https://docs.microsoft.com/en-us/azure/active-directory-b2c/overview

https://docs.microsoft.com/en-us/azure/active-directory-b2c/identity-provider-azure-ad-single-tenant

https://github.com/AzureAD/microsoft-identity-web

https://docs.microsoft.com/en-us/azure/active-directory/develop/microsoft-identity-web

https://docs.microsoft.com/en-us/azure/active-directory-b2c/identity-provider-local

https://docs.microsoft.com/en-us/azure/active-directory/

https://docs.microsoft.com/en-us/aspnet/core/security/authentication/azure-ad-b2c

https://github.com/azure-ad-b2c/azureadb2ccommunity.io

https://github.com/azure-ad-b2c/samples

Sunday, 05. September 2021

Jon Udell

Metabase as a lightweight app server

In A virtuous cycle for analytics I said this about Metabase: It’s all nicely RESTful. Interactive elements that can parameterize queries, like search boxes and date pickers, map to URLs. Queries can emit URLs in order to compose themselves with other queries. I came to see this system as a kind of lightweight application server … Continue reading Metabase as a lightweight app server

In A virtuous cycle for analytics I said this about Metabase:

It’s all nicely RESTful. Interactive elements that can parameterize queries, like search boxes and date pickers, map to URLs. Queries can emit URLs in order to compose themselves with other queries. I came to see this system as a kind of lightweight application server in which to incubate an analytics capability that could later be expressed more richly.

Let’s explore that idea in more detail. Consider this query that finds groups created in the last week.

with group_create_days as (
  select to_char(created, 'YYYY-MM-DD') as day
  from "group"
  where created > now() - interval '1 week'
)
select day, count(*)
from group_create_days
group by day
order by day desc

A Metabase user can edit the query and change the interval to, say, 1 month, but there’s a nicer way to enable that. Terms in double squigglies are Metabase variables. When you type {{interval}} in the query editor, the Variables pane appears.

Here I’m defining the variable’s type as text and providing the default value 1 week. The query sent to Postgres will be the same as above. Note that this won’t work if you omit ::interval. Postgres complains: “ERROR: operator does not exist: timestamp with time zone - character varying.” That’s because Metabase doesn’t support variables of type interval as required for date subtraction. But if you cast the variable to type interval (writing {{interval}}::interval in the query) it’ll work.

That’s an improvement. A user of this Metabase question can now type 2 months or 1 year to vary the interval. But while Postgres’ interval syntax is fairly intuitive, this approach still requires people to make an intuitive leap. So here’s a version that eliminates the guessing.

The variable type is now Field Filter; the filtered field is the created column of the group table; the widget type is Relative Date; the default is Last Month. Choosing other intervals is now a point-and-click operation. It’s less flexible — 3 weeks is no longer an option — but friendlier.

Metabase commendably provides URLs that capture these choices. The default in this case is METABASE_SERVER/question/1060?interval=lastmonth. For the Last Year option it becomes interval=lastyear.

Because all Metabase questions that use variables work this way, the notion of Metabase as rudimentary app server expands to sets of interlinked questions. In Working in a hybrid Metabase / Postgres code base I showed the following example.

A Metabase question, #600, runs a query that selects columns from the view top_20_annotated_domains_last_week. It interpolates one of those columns, domain, into an URL that invokes Metabase question #985 and passes the domain as a parameter to that question. In the results for question #600, each row contains a link to a question that reports details about groups that annotated pages at that row’s domain.

This is really powerful stuff. Even without all the advanced capabilities I’ve been discussing in this series — pl/python functions, materialized views — you can do a lot more with the Metabase / Postgres combo than you might think.

For example, here’s an interesting idiom I’ve discovered. It’s often useful to interpolate a Metabase variable into a WHERE clause.

select * from dashboard_users where email = {{ email }}

You can make that into a fuzzy search using the case-insensitive regex-match operator ~*.

select * from dashboard_users where email ~* {{ email }}

That’ll find a single address regardless of case; you can also find all records matching, say, ucsc.edu. But it requires the user to type some value into the input box. Ideally this query won’t require any input. If none is given, it lists all addresses in the table. If there is input it does a fuzzy match on that input. Here’s a recipe for doing that. Tell Metabase that {{ email }} is a required variable, and set its default to any. Then, in the query, do this:

select * from dashboard_users where email ~* case when {{ email }} = 'any' then '' else {{ email}} end

In the default case the matching operator binds to the empty string, so it matches everything and the query returns all rows. For any other input the operator binds to a value that drives a fuzzy search.

This is all very nice, you may think, but even the simplest app server can write to the database as well as read from it, and Metabase can’t. It’s ultimately just a tool that you point at a data warehouse to SELECT data for display in tables and charts. You can’t INSERT or UPDATE or ALTER or DELETE or CALL anything.

Well, it turns out that you can. Here’s a Metabase question that adds a user to the table.

select add_dashboard_user( {{email}} )

How can this possibly work? If add_dashboard_user were a Postgres procedure you could CALL it from psql, but in this context you can only SELECT.

We’ve seen the solution in Postgres set-returning functions that self-memoize as materialized views. A Postgres function written in pl/python can import and use a Python function from a plpython_helpers module. That helper function can invoke psql to CALL a procedure. So this is possible.
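As a rough sketch of the wiring described above: a helper module importable from pl/python shells out to psql to CALL a procedure. This is an illustration of the idea rather than the code from the earlier posts; the database name ("analytics") and the way arguments are quoted are assumptions made for the example.

# plpython_helpers.py -- minimal sketch of a helper importable from pl/python.
# The database name "analytics" is a placeholder.
import subprocess

def call_procedure(proc_name, *args):
    # pl/python functions run in a SELECT context, which cannot CALL a
    # procedure directly; running psql in a subprocess sidesteps that.
    quoted = ", ".join("'%s'" % str(a).replace("'", "''") for a in args)
    sql = "CALL %s(%s);" % (proc_name, quoted)
    subprocess.run(["psql", "-d", "analytics", "-c", sql], check=True)

A pl/python function named add_dashboard_user would then just import this module and forward its email argument to call_procedure, which is how a plain SELECT issued by Metabase can end up writing to the database.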

We’ve used Metabase for years. It provides a basic, general-purpose UX that’s deeply woven into the fabric of the company. Until recently we thought of it as a read-only system for analytics, so a lot of data management happens in spreadsheets that don’t connect to the data warehouse. It hadn’t occurred to me to leverage that same basic UX for data management too, and that’s going to be a game-changer. I always thought of Metabase as a lightweight app server. With some help from Postgres it turns out to be a more capable one than I thought.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Saturday, 04. September 2021

reb00ted

Today's apps: context-free computing at its finest

Prompted by this exchange on Twitter: Functionality attached to the context-defining data object. Instead of an “app” silo. Now here’s a thought. :-) — Johannes Ernst (@Johannes_Ernst) September 4, 2021 Let me unpack this a bit. Let’s say that I’d like to send a message to some guy, let’s call him Tom for this example. The messaging app vendors go like: “Oh yes, we’ll make it super-easy fo

Prompted by this exchange on Twitter:

Functionality attached to the context-defining data object. Instead of an “app” silo. Now here’s a thought. :-)

— Johannes Ernst (@Johannes_Ernst) September 4, 2021

Let me unpack this a bit. Let’s say that I’d like to send a message to some guy, let’s call him Tom for this example.

The messaging app vendors go like: “Oh yes, we’ll make it super-easy for him (Johannes) to send a message to Tom, so we give him an extra short handle (say @tom) so he doesn’t need to type much, and also add a ‘reply’ button to previous messages so he won’t even have to type that.”

Which is myopic, because it completely misunderstands or ignores the context of the user. The user doesn’t think that way, at least most of the time.

As a user, I think the pattern is this:

“I need to tell (person) about (news) related to (common context), how do I best do that (i.e. which app)?”

Three concepts and a sub-concept:

Who do I want to tell. In Joshua’s example: the person(s) I’m about to meet with.

What’s the news I want to tell them. Here it is “I am running late”.

But this news only makes sense in the common context of “we agreed to meet today”. If the receiver doesn’t have that context, they will receive my message and it will be pointless and bewildering to them.

What’s the best way of conveying that news. It might be a text, a phone call, or pretty much anything. This item is the least important in this entire scenario, as long as the receiver gets the message.

So there are two primary “entry” points for this scenario:

It starts with a person. The user thinks “Tom, I need to tell Tom that I’ll be late”, and then frantically tries to find a way of contacting them while hurrying in the subway or driving really fast. (We have all been there.)

It starts with the shared context object. The user looks at the calendar and thinks “Oh darn, this meeting, it’s now, I’ll be late, I need to tell them”. (They might not even remember who exactly will be in the meeting, but then likely the calendar event has that info.)

The entry point is almost never: “how convenient, I have all the people I’m about to meet with in the current window of the messaging app, they all know that whatever I’m going to type into that window next is about the meeting, and I can just simply say that I’ll be late.”

So … if we were to put the user, and their experience, at the center of messaging, messaging wouldn’t be an app. (Well, it might also be an app, but mostly it wouldn’t.)

Instead, the messaging would be a “system service” attached to the context objects in which I’d like to message. In Joshua’s example that is:

the person I’m about to meet with (but this only works if there is a single person; if there are a dozen this does not work).

the shared context about which I am conveying the news: here, the (hopefully shared) calendar event. Joshua’s original point.

Now in my experience, this is just an example for a more general pattern. For example:

meeting notes. In which meeting any meeting notes were taken is of course supremely important. Which is why most meeting minutes start with a title, a date/time and a list of attendees. Instead, they should be attached to the calendar event, just like the message thread. (Yes, including collaborative editing.)

And by “attached” I don’t mean: there’s a URL somewhere in the calendar dialog form that leads to a Google Doc. No, I mean that the calendar can show them, and insert the to-do-items from the meeting, and future meetings from that document, and the other way around: that when you look at the meeting notes, I can see the calendar event, and find out it is a biweekly meeting, and the notes for the other meetings are over there.

projects. Software development is the worst offender, as far as I know. If you and I and half a dozen other people work on a project to implement functionality X, that is our primary shared context. All communication and data creation should occur in that context. But instead we have our code in Github, our bugs in Confluence, our APIs … test cases … screen mockups … video calls … chats … in a gazillion different places, and one of these days I’m sure somebody is going to prove that half of software budgets in many places are consumed by context switching between tools that blissfully ignore that the user thinks not like a vendor does. (And the other half by trying to make the so-called “integrations” between services work.)

There are many more examples. (In a previous life, I ran around with the concept of “situational computing” – which takes context-aware computing to something much more dynamic and extreme; needless to say it was decades before its time. But now with AR/VR-stuff coming, it will probably become part of the mainstream soon.)

The 100 dollar question is of course: if this is the right thing for users, why aren’t apps doing that?

Joshua brought up OpenDoc in a subsequent post. Yep, there is a similarity here, and apps don’t do this kind of thing for the same reason OpenDoc failed. (It didn’t fail because it was slow as molasses and Steve Jobs hated it.)

OpenDoc failed because it would have disintermediated big and powerful companies such as Adobe, who would have had to sell 100 little OpenDoc components, instead of one gigantic monolith containing those 100 components as a take-it-or-leave-it package at “enterprise” pricing. With no adoption by Adobe, and with Adobe feeling that Apple had proactively attacked its business model, it would have killed Apple right afterwards.

But having OpenDoc would have been sooo much better for users. We would have gotten much more innovation, and yes, lower prices. But software vendors, like all businesses, primarily do what is right for them, not their customers.

Which is why we get silos everywhere we look.

If we wanted to change that situation, about OpenDoc-like things, or messaging attached, in context, to things like calendar events, we have to change the economic situation in which the important vendors find themselves.

Plus of course the entire technology stack, because if all you know is how to ship an int main(argv, argc) on your operating system, something componentized and pluggable and user-centric and context-centric is never able to emerge, even if you want to.

(*) The title is meant to be sarcastic, in case you weren’t sure.

Friday, 03. September 2021

Jon Udell

Notes for an annotation SDK

While helping Hypothesis find its way to ed-tech it was my great privilege to explore ways of adapting annotation to other domains including bioscience, journalism, and scholarly publishing. Working across these domains showed me that annotation isn’t just an app you do or don’t adopt. It’s also a service you’d like to be available in … Continue reading Notes for an annotation SDK

While helping Hypothesis find its way to ed-tech it was my great privilege to explore ways of adapting annotation to other domains including bioscience, journalism, and scholarly publishing. Working across these domains showed me that annotation isn’t just an app you do or don’t adopt. It’s also a service you’d like to be available in every document workflow that connects people to selections in documents.

In my talk Weaving the Annotated Web I showcased four such services: Science in the Classroom, The Digital Polarization Project, SciBot, and ClaimChart. Others include tools to evaluate credibility signals, or review claims, in news stories.

As I worked through these and other scenarios, I accreted a set of tools for enabling any annotation-aware interaction in any document-oriented workflow. I’ve wanted to package these as a coherent software development kit; that hasn’t happened yet, but here are some of the ingredients that belong in such an SDK.

Creating an annotation from a selection in a document

Two core operations lie at the heart of any annotation system: creating a note that will bind to a selection in a document, and binding (anchoring) that note to its place in the document. A tool that creates an annotation reacts to a selection in a document by forming one or more selectors that describe the selection.

The most important selector is TextQuoteSelector. If I visit http://www.example.com and select the phrase “illustrative examples” and then use Hypothesis to annotate that selection, the payload sent from the client to the server includes this construct.

{ "type": "TextQuoteSelector", "exact": "illustrative examples", "prefix": "n\n This domain is for use in ", "suffix": " in documents. You may use this\n" }

The Hypothesis client formerly used an NPM module, dom-anchor-text-quote, to derive that info from a selection. It no longer uses that module, and the equivalent code that it does use isn’t separately available. But annotations created using TextQuoteSelectors formed by dom-anchor-text-quote interoperate with those created using the Hypothesis client, and I don’t expect that will change since Hypothesis needs to remain backwards-compatible with itself.

You’ll find something like TextQuoteSelector in any annotation system. It’s formally defined by the W3C here. In the vast majority of cases this is all you need to describe the selection to which an annotation should anchor.

There are, however, cases where TextQuoteSelector won’t suffice. Consider a document that repeats the same passage three times. Given a short selection in the first of those passages, how can a system know that an annotation should anchor to that one, and not the second or third? Another selector, TextPositionSelector (https://www.npmjs.com/package/dom-anchor-text-position), enables a system to know which passage contains the selection.

{ "type": "TextPositionSelector", "start": 51 "end": 72, }

It records the start and end of the selection in the visible text of an HTML document. Here’s the HTML source of that web page.

<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>

Here is the visible text to which the TextQuoteSelector refers.

\n\n Example Domain\n This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.\n More information…\n\n\n\n

The positions recorded by a TextPositionSelector can change for a couple of reasons. If the document is altered, it’s obvious that an annotation’s start and stop numbers might change. Less obviously, that can happen even if the document’s text isn’t altered. A news website, for example, may inject different kinds of advertising-related text content from one page load to the next. In that case the positions for two consecutive Hypothesis annotations made on the same selection can differ. So while TextPositionSelector can resolve ambiguity, and provide hints to an annotation system about where to look for matches, the foundation is ultimately TextQuoteSelector.
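To make the division of labor concrete, here is a small sketch, in Python rather than the JavaScript modules mentioned below, of how a TextQuoteSelector plus an optional TextPositionSelector hint might be resolved against a document's visible text. It illustrates the idea; it is not the algorithm those modules actually implement.

def anchor(visible_text, quote, position=None):
    # quote: dict with "exact", "prefix", "suffix" (a TextQuoteSelector).
    # position: optional dict with "start"/"end" (a TextPositionSelector),
    # used only as a hint to choose among repeated occurrences.
    exact = quote["exact"]
    matches, i = [], visible_text.find(exact)
    while i != -1:
        matches.append(i)
        i = visible_text.find(exact, i + 1)
    if not matches:
        return None  # quote no longer present in the document

    def score(start):
        s = 0
        if visible_text[:start].endswith(quote.get("prefix", "")):
            s += 2
        if visible_text[start + len(exact):].startswith(quote.get("suffix", "")):
            s += 2
        if position:
            # prefer the occurrence closest to the recorded position
            s -= abs(start - position["start"]) / max(len(visible_text), 1)
        return s

    best = max(matches, key=score)
    return best, best + len(exact)

The position hint only matters when the quoted text occurs more than once, which is exactly the disambiguation role described above.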

If you try the first example in the README at https://github.com/judell/TextQuoteAndPosition, you can form your own TextQuoteSelector and TextPositionSelector from a selection in a web page. That repo exists only as a wrapper around the set of modules — dom-anchor-text-quote, dom-anchor-text-position, and wrap-range-text — needed to create and anchor annotations.

Building on these ingredients, HelloWorldAnnotated illustrates a common pattern.

1. Given a selection in a page, form the selectors needed to post an annotation that targets the selection.
2. Lead a user through an interaction that influences the content of that annotation.
3. Post the annotation.

Here is an example of such an interaction. It’s a content-labeling scenario in which a user rates the emotional affect of a selection. This is the kind of thing that can be done with the stock Hypothesis client, but awkwardly because users must reliably add tags like WeakNegative or StrongPositive to represent their ratings. The app prompts for those tags to ensure consistent use of them.

Although the annotation is created by a standalone app, the Hypothesis client can anchor it, display it, and even edit it.

And the Hypothesis service can search for sets of annotations that match the tags WeakNegative or StrongPositive.

There’s powerful synergy at work here. If your annotation scenario requires controlled tags, or a prescribed workflow, you might want to adapt the Hypothesis client to do those things. But it can be easier to create a standalone app that does exactly what you need, while producing annotations that interoperate with the Hypothesis system.
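As a rough sketch of that pattern, here is what a minimal standalone “rate this selection” app might send. The selector follows the W3C model shown earlier; the token, group ID, and tag vocabulary are placeholders, and the exact payload fields should be checked against the Hypothesis API documentation rather than taken from this sketch.

import json
import urllib.request

API = "https://api.hypothes.is/api/annotations"
TOKEN = "YOUR_API_TOKEN"     # placeholder
GROUP = "__world__"          # placeholder: public layer, or a private group id

def post_rating(uri, exact, prefix, suffix, rating_tag, comment=""):
    # rating_tag comes from a controlled vocabulary, e.g. WeakNegative ... StrongPositive,
    # instead of relying on annotators to type tags consistently.
    payload = {
        "uri": uri,
        "group": GROUP,
        "text": comment,
        "tags": [rating_tag],
        "target": [{
            "source": uri,
            "selector": [{
                "type": "TextQuoteSelector",
                "exact": exact,
                "prefix": prefix,
                "suffix": suffix,
            }],
        }],
    }
    req = urllib.request.Request(
        API,
        data=json.dumps(payload).encode(),
        headers={"Authorization": "Bearer " + TOKEN,
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())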

Anchoring an annotation to its place in a document

Using this same set of modules, a tool or system can retrieve an annotation from a web service and anchor it to a document in the place where it belongs. You can try the second example in the README at https://github.com/judell/TextQuoteAndPosition to see how this works.

For a real-world demonstration of this technique, see Science in the Classroom. It’s a project sponsored by The American Association for the Advancement of Science. Graduate students annotate research papers selected from the Science family of journals so that younger students can learn about the terminology, methods, and outcomes of scientific research.

Pre-Hypothesis, annotations on these papers were displayed using Learning Lens, a viewer that color-codes them by category.

Nothing about Learning Lens changed when Hypothesis came into the picture, it just provided a better way to record the annotations. Originally that was done as it’s often done in the absence of a formal way to describe annotation targets, by passing notes like: “highlight the word ‘proximodistal’ in the first paragraph of the abstract, and attach this note to it.” This kind of thing happens a lot, and wherever it does there’s an opportunity to adopt a more rigorous approach. Nowadays at Science in the Classroom the annotators use Hypothesis to describe where notes should anchor, as well as what they should say. When an annotated page loads it searches Hypothesis for annotations that target the page, and inserts them using the same format that’s always been used to drive the Learning Lens. Tags assigned by annotators align with Learning Lens categories. The search looks only for notes from designated annotators, so nothing unwanted will appear.

An annotation-powered survey

The Credibility Coalition is “a research community that fosters collaborative approaches to understanding the veracity, quality and credibility of online information.” We worked with them on a project to test a set of signals that bear on the credibility of news stories. Examples of such signals include:

– Title Representativeness (Does the title of an article accurately reflect its content?)
– Sources (Does the article cite sources?)
– Acknowledgement of uncertainty (Does the author acknowledge uncertainty, or the possibility things might be otherwise?)

Volunteers were asked these questions for each of a set of news stories. Many of the questions were yes/no or multiple choice and could have been handled by any survey tool. But some were different. What does “acknowledgement of uncertainty” look like? You know it when you see it, and you can point to examples. But how can a survey tool solicit answers that refer to selections in documents, and record their locations and contexts?

The answer was to create a survey tool that enabled respondents to answer such questions by highlighting one or more selections. Like the HelloWorldAnnotated example above, this was a bespoke client that guided the user through a prescribed workflow. In this case, that workflow was more complex. And because it was defined in a declarative way, the same app can be used for any survey that requires people to provide answers that refer to selections in web documents.

A JavaScript wrapper for the Hypothesis API

The HelloWorldAnnotated example uses functions from a library, hlib, to post an annotation to the Hypothesis service. That library includes functions for searching and posting annotations using the Hypothesis API. It also includes support for interaction patterns common to annotation apps, most of which occur in facet, a standalone tool that searches, displays, and exports sets of annotations. Supported interactions include:

– Authenticating with an API token

– Creating a picklist of groups accessible to the authenticated user

– Assembling and displaying conversation threads

– Parsing annotations

– Editing annotations

– Editing tags

In addition to facet, other tools based on this library include CopyAnnotations and TagRename

A Python wrapper for the Hypothesis API

If you’re working in Python, hypothesis-api is an alternative API wrapper that supports searching for, posting, and parsing annotations.
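For orientation, a search against the public endpoint needs nothing but the standard library; wrappers like the ones above add conveniences (auth, paging, thread assembly) on top of the same call. The URI and tag values here are just examples.

import json
import urllib.parse
import urllib.request

def search_annotations(uri=None, tag=None, user=None, limit=50):
    # Query the Hypothesis search endpoint and return the matching rows.
    params = {"limit": limit}
    if uri:
        params["uri"] = uri
    if tag:
        params["tag"] = tag
    if user:
        params["user"] = user
    url = "https://api.hypothes.is/api/search?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())["rows"]

# e.g. annotations on example.com carrying a controlled rating tag:
# for row in search_annotations(uri="http://www.example.com", tag="StrongPositive"):
#     print(row["user"], row.get("text", ""))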

Notifications

If you’re a publisher who embeds Hypothesis on your site, you can use a wildcard search to find annotations. But it would be helpful to be notified when annotations are posted. h_notify is a tool that uses the Hypothesis API to watch for annotations on individual or wildcard URLs, or from particular users, or in a specified group, or with a specified tag.

When an h_notify-based watcher finds notes in any of these ways, it can send alerts to a Slack channel, or to an email address, or add items to an RSS feed.
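A stripped-down version of that watch-and-alert loop might look like the sketch below. It is not h_notify itself; the Slack webhook URL, tag, and polling interval are placeholders, and a real watcher would persist what it has already seen instead of keeping it in memory.

import json
import time
import urllib.parse
import urllib.request

SEARCH = "https://api.hypothes.is/api/search"
SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

def notify(text):
    # Post a message to a Slack incoming webhook.
    req = urllib.request.Request(
        SLACK_WEBHOOK,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

def watch(tag, interval=300):
    # Poll for recently updated annotations carrying a tag and forward new ones.
    # Note: on the first pass this sketch announces whatever already exists.
    seen = set()
    while True:
        qs = urllib.parse.urlencode(
            {"tag": tag, "sort": "updated", "order": "desc", "limit": 20})
        with urllib.request.urlopen(SEARCH + "?" + qs) as resp:
            rows = json.loads(resp.read())["rows"]
        for row in rows:
            if row["id"] not in seen:
                seen.add(row["id"])
                notify("%s annotated %s: %s" %
                       (row["user"], row["uri"], row.get("text", "")[:200]))
        time.sleep(interval)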

At Hypothesis we mainly rely on the Slack option. In this example, user nirgendheim highlighted the word “interacting” in a page on the Hypothesis website.

The watcher sent this notice to our #website channel in Slack.

A member of the support team (Hypothesis handle mdiroberts) saw it there and responded to nirgendheim as shown above. How did nirgendheim know that mdiroberts had responded? The core Hypothesis system sends you an email when somebody replies to one of your notes. h_notify is for bulk monitoring and alerting.

A tiny Hypothesis server

People sometimes ask about connecting the Hypothesis client to an alternate server in order to retain complete control over their data. It’s doable, you can follow the instructions here to build and run your own server, and some people and organizations do that. Depending on need, though, that can entail more effort, and more moving parts, than may be warranted.

Suppose for example you’re part of a team of investigative journalists annotating web pages for a controversial story, or a team of analysts sharing confidential notes on web-based financial reports. The documents you’re annotating are public, but the notes you’re taking in a Hypothesis private group are so sensitive that you’d rather not keep them in the Hypothesis service. You’d ideally like to spin up a minimal server for that purpose: small, simple, and easy to manage within your own infrastructure.

Here’s a proof of concept. This tiny server clocks in at just 145 lines of Python with very few dependencies. It uses Python’s batteries-included SQLite module for annotation storage. The web framework is Pyramid only because that’s what I’m familiar with; it could as easily be Flask, the ultra-light framework typically used for this sort of thing.

A tiny app wrapped around those ingredients is all you need to receive JSON payloads from a Hypothesis client, and return JSON payloads when the client searches for annotations to anchor to a page.
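Purely to suggest how little is involved, here is a comparable sketch using Flask and SQLite rather than Pyramid. It is not the 145-line server described above: routes, table layout, and the degree of fidelity to the client's expectations are all simplified.

import json
import sqlite3
import uuid

from flask import Flask, request, jsonify

app = Flask(__name__)
DB = "annotations.db"

def db():
    conn = sqlite3.connect(DB)
    conn.execute(
        "create table if not exists annotations (id text primary key, uri text, doc text)")
    return conn

@app.route("/api/annotations", methods=["POST"])
def create():
    # Accept an annotation payload from the client and store it as JSON.
    doc = request.get_json()
    doc["id"] = str(uuid.uuid4())
    conn = db()
    conn.execute("insert into annotations values (?, ?, ?)",
                 (doc["id"], doc.get("uri", ""), json.dumps(doc)))
    conn.commit()
    return jsonify(doc)

@app.route("/api/search")
def search():
    # Return stored annotations for a given uri, in the rows/total shape the client expects.
    uri = request.args.get("uri", "")
    conn = db()
    rows = [json.loads(r[0]) for r in
            conn.execute("select doc from annotations where uri = ?", (uri,))]
    return jsonify({"total": len(rows), "rows": rows})

if __name__ == "__main__":
    app.run(port=5000)

The Pyramid-based proof of concept handles more of what the client sends (updates, deletes, permissions), and that is the one the rest of this section refers to.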

The service is dockerized and easy to deploy. To test it I used the fly.io speedrun to create an instance at https://summer-feather-9970.fly.dev. Then I made the handful of small tweaks to the Hypothesis client shown in client-patches.txt. My method for doing that, typical for quick proofs of concept that vary the Hypothesis client in some small way, goes like this:

1. Clone the Hypothesis client.
2. Edit gulpfile.js to say const IS_PRODUCTION_BUILD = false. This turns off minification so it’s possible to read and debug the client code.
3. Follow the instructions to run the client from a browser extension.
4. After establishing a link between the client repo and browser-extension repo, as per those instructions, use this build command — make build SETTINGS_FILE=settings/chrome-prod.json — to create a browser extension that authenticates to the Hypothesis production service.
5. In a Chromium browser (e.g. Chrome or Edge or Brave) use chrome://extensions, click Load unpacked, and point to the browser-extension/build directory where you built the extension.

This is the easiest way to create a Hypothesis client in which to try quick experiments. There are tons of source files in the repos, but just a handful of bundles and loose files in the built extension. You can run the extension, search and poke around in those bundles, set breakpoints, make changes, and see immediate results.

In this case I only made the changes shown in client-patches.txt:

– In options/index.html I added an input box to name an alternate server.
– In options/options.js I sync that value to the cloud and also to the browser’s localStorage.
– In the extension bundle I check localStorage for an alternate server and, if present, modify the API request used by the extension to show the number of notes found for a page.
– In the sidebar bundle I check localStorage for an alternate server and, if present, modify the API requests used to search for, create, update, and delete annotations.

I don’t recommend this cowboy approach for anything real. If I actually wanted to use this tweaked client I’d create branches of the client and the browser-extension, and transfer the changes into the source files where they belong. If I wanted to share it with a close-knit team I’d zip up the extension so colleagues could unzip and sideload it. If I wanted to share more broadly I could upload the extension to the Chrome web store. I’ve done all these things, and have found that it’s feasible — without forking Hypothesis — to maintain branches that maintain small but strategic changes like this one. But when I’m aiming for a quick proof of concept, I’m happy to be a cowboy.

In any event, here’s the proof. With the tiny server deployed to summer-feather-9970.fly.dev, I poked that address into the tweaked client.

And sure enough, I could search for, create, reply to, update, and delete annotations using that 145-line SQLite-backed server.

The client still authenticates to Hypothesis in the usual way, and behaves normally unless you specify an alternate server. In that case, the server knows nothing about Hypothesis private groups. The client sees it as the Hypothesis public layer, but it’s really the moral equivalent of a private group. Others will see it only if they’re running the same tweaked extension and pointing to the same server. You could probably go quite far with SQLite but, of course, it’s easy to see how you’d swap it out for a more robust database like Postgres.

Signposts

I think of these examples as signposts pointing to a coherent SDK for weaving annotation into any document workflow. They show that it’s feasible to decouple and recombine the core operations: creating an annotation based on a selection in a document, and anchoring an annotation to its place in a document. Why decouple? Reasons are as diverse as the document workflows we engage in. The stock Hypothesis system beautifully supports a wide range of scenarios. Sometimes it’s helpful to replace or augment Hypothesis with a custom app that provides a guided experience for annotators and/or an alternative display for readers. The annotation SDK I envision will make it straightforward for developers to build solutions that leverage the full spectrum of possibility.


Simon Willison

Per-project PostgreSQL

Per-project PostgreSQL Jamey Sharp describes an ingenious way of setting up PostgreSQL instances for each of your local development projects, without depending on an always-running shared localhost database server. The trick is a shell script which creates a PGDATA folder in the current folder and then instantiates a PostgreSQL server in --single single user mode which listens on a Unix domain so

Per-project PostgreSQL

Jamey Sharp describes an ingenious way of setting up PostgreSQL instances for each of your local development projects, without depending on an always-running shared localhost database server. The trick is a shell script which creates a PGDATA folder in the current folder and then instantiates a PostgreSQL server in --single single user mode which listens on a Unix domain socket in that folder, instead of listening on the network. Jamey then uses direnv to automatically configure that PostgreSQL, initializing the DB if necessary, for each of his project folders.

Via John Wiseman


@_Nat Zone

I will be giving a keynote at EIC in Munich

The week after next, on September 13, I will be heading to Unter… The post “I will be giving a keynote at EIC in Munich” first appeared on @_Nat Zone.

The week after next, on September 13, I will be traveling to Unterschleissheim, near Munich, Germany, to give a keynote at the European Identity and Cloud Conference 2021. It will be my first trip abroad in a year and a half.

This time, together with Gottfried Leibbrandt, the former CEO of SWIFT, I will be presenting the “Global Assured Identity Network – GAIN”, which I have been working on recently.

As for what exactly GAIN is, please look forward to the day of the talk.

Introducing The Global Assured Identity Network (GAIN)
Keynote
Monday, September 13, 2021 19:20—19:40 (CET)

As an aside: by the way, could we please do away with the two-week quarantine already? Over there they only have about one third as many new COVID-19 cases as Tokyo…

The post “I will be giving a keynote at EIC in Munich” first appeared on @_Nat Zone.

Thursday, 02. September 2021

MyDigitalFootprint

Is it better to prevent or correct?

This link gives you access to all the articles and archives for </Hello CDO> “Prevention or Correction” is something society has wrestled with for a very long time. This article focuses on why our experience of “prevention or correction” ideas frames the CDO’s responsibilities and explores a linkage to a company’s approach to data. Almost irrespective of where you reside, we live with
This link gives you access to all the articles and archives for </Hello CDO>

“Prevention or Correction” is something society has wrestled with for a very long time. This article focuses on why our experience of “prevention or correction” ideas frames the CDO’s responsibilities and explores a linkage to a company’s approach to data.


Almost irrespective of where you reside, we live with police, penal, and political systems that cannot fully agree on preventing or correcting. It is not that we disagree with the “why”; is it the “how” that divides us! I am a child of an age when lefthandedness was still seen as something to correct, so we have made some progress.
A fundamental issue is that prevention is the better idea; but if you prevent something, it never occurs. We are then left with the dilemma, “did you prevent it, or was it not going to occur anyway?” The finance team then asks, “Have we wasted resources on something that would not have happened?”

When something we don’t like does occur, we jump into action as we can correct it. We prioritise, allocate resources and measure them. We (humanity) appear to prefer a problem we can solve (touch, feel, see) and can measure rather than prevent. A proportion of crime is driven by the poor economic situation of a population with no choice. Yet, we keep that same population in the same economic conditions as we have limited resources committed to correction (control.) It is way more messy and complex, and we need both, but prevention is not an easy route. Just think about our current global pandemic, climate change or sustainability. Prevention was a possibility in each case, but we kick into action now correction is needed.

In our new digital and data world, we are facing the same issues of prevention vs correction. Should we prevent data issues or correct data issues, and who owns the problem?

In the data industry right now, we correct and just like the criminal services, we have allocated all our budget to correction, so we don’t have time or resources for prevention. I would be rich if I had a dollar for every time I hear, “We need results, Fish, not an x*x*x*x data philosophy.”

For anyone who follows my views and beliefs, I advocate data quality founded on lineage and provenance (prevention). I am not a fan of the massive industry that has been built up to correct data (correction). I see it as a waste on many levels, but FUD sells to a naive buyer. I am a supporter of having a presence at the source of data to guarantee attestation and assign rights. I cannot get my head around the massive budgets set aside to correct “wrong” data. We believe that data quality is knowing the data is wrong and thinking we can correct it! We have no idea if the correction is correct, and the updated data still lacks attestation. We measure this madness and assign KPIs to improve our corrective powers. In this case, because it is data and not humans, prevention is the preferred solution, as neither a cure nor correction works at any economic level that can be justified, unless you are a provider of the service or one of the capital providers. But as already mentioned, prevention is too hard to justify. The same analogy is a recurring theme in the buy (outsource) vs build debate for the new platform, but that is another post.
Unpacking to expose and explore some motivations. We know that the CDO job emerged in 2002, and some 20 years on, it is still in its infancy. Tenures are still short (less than three years), and there is a lack of skills and experience because this data and digital thing is all new compared to marketing, finance, or accounting. As I have written before, our job descriptions for the CDO remain poorly defined, with mandates too broad and allocated resources too limited. As in many articles at </Hello CDO>, I focus on the conflicts CDOs have because of the mandates we are given.

Whilst mandates are too broad, thankfully, security and privacy are becoming new roles with their own mandates. The CDO is still being allocated a data transformation mandate but being asked to correct the data whilst doing it. Not surprisingly, most data projects fail as the board remains fixated on the idea that the data we have is of value, and that we should allocate all our resources to correcting and securing it. A bit like the house built on sand, we endlessly commit resources to underpin a house without foundations because it is there, rather than moving to more secure ground and building the right house. Prevention or correction?

All CDO’s face the classic problem of being asked to solve the world’s debt famine crisis with no budget, no resources and yesterday would be good. Solve the data quality problem by correcting the waste at the end. Because of this commitment to correct data, we find we are trapped. “We should just finish the job as we have gone so far,” is what the finance team says, “prevention is like starting all over again”; it will be too expensive, wreck budgets and we will miss our bonus hurdle as it means a total re-start. The budget is too big so let’s just put in some more foundations into the house built on sand.

One of my favourite sayings is “the reason it is called a shortcut is because it is a short cut and it misses something out.” I often use it when we boil a decision down to an ROI number, believing that all the information needed can be presented in a simple single-dimensional number. The correction ideals will always win when we use shortcuts, especially ROI.

Prevention is a hard sell. Correction is a simple sell. Who benefits from an endless correction policy? Probably the CDO! We get to look super busy, it is easy to get the commitment, and no one at the board or in the senior leadership team will argue to do anything else. It might take three years to see that the transformation has not happened, and there is plenty of demand for CDOs. Why would a CDO want to support prevention? Why is the CDO role so in question?

Prevention takes time, is complex, and you may not see results during your tenure. I often reflect on what will be my CDO legacy? Correction is instant, looks busy and gets results that you can be measured on. Correction means there is always work to be done. Prevention means you will eventually be out of a job. Since prevention cannot be measured and is hard to justify with an ROI calculation, maybe the CEO needs to focus on measuring the success of analytics?
Note to the CEO: Data is complex, and like all expert discipline areas, you will seek advice, opinion and counsel from various sources to help form a balanced view. Data quality is a thorny one, as most of those around you will inform you of the benefits of correction over prevention. Correction wins, and it is unlikely anyone will argue the balanced view for prevention.

Perhaps it is worth looking at the CDO job description, focusing on what the KPIs are for and how you shift the focus to the outcomes of the analysis. Improving analysis and outcomes demands better data quality, and correction can only get you so far. You get prevention by the back door.

The Dingle Group

Kilt and Social KYC

In July Vienna Digital Identity hosted a fireside chat with Ingo Ruebe and Mark Cachia on KILT Protocol's role in providing digital identity in the Polkadot ecosystem.

In July Vienna Digital Identity hosted a fireside chat with Ingo Ruebe and Mark Cachia on KILT Protocol's role in providing digital identity in the Polkadot ecosystem.

Ingo and Mark discussed some of the history of Kilt and how it decided to work with the Polkadot ecosystem, and shared Kilt’s new Social KYC offering.

To watch a recording of the event please check out the link: https://vimeo.com/580422623

My apologies to Ingo and Mark in being so tardy in getting this video up and available. (Michael Shea - Sept 2021)

If interested in getting notifications of upcoming events please join the event group at: https://www.meetup.com/Vienna-Digital-Identity-Meetup/

The Vienna Digital Identity Meetup is hosted by The Dingle Group and is focused on educating business, societal, legal and technologists on the new opportunities that arise with a high assurance digital identity created by the reduction of risk and strengthened provenance. We meet on the 4th Monday of every month, in person (when permitted) in Vienna and online on Zoom. Connecting and educating across borders and oceans.


Identity Praxis, Inc.

ICO’s Child Protection Rules Take Effect Sept. 2, 2021. Are You Ready?

The UK Information Commissioner’s Office (ICO) Children’s Code, officially known as the “Age Appropriate Design Code: a code of practice for online services,” after a one-year grace period, goes into effect Thursday, Sept. 2, 2021. The code, which falls under section 125(1)(b) of the UK Data Protection Act 2018 (the Act), looks to protect UK children, i.e., people […] The post ICO’s Child Protection Rule

The UK Information Commissioner’s Office (ICO) Children’s Code, officially known as the “Age Appropriate Design Code: a code of practice for online services,” goes into effect on Thursday, Sept. 2, 2021, after a one-year grace period. The code, which falls under section 125(1)(b) of the UK Data Protection Act 2018 (the Act), looks to protect UK children, i.e., people under the age of eighteen (18).

Are you ready? I hope so; this code applies to any business, in and out of the UK, that provides digital services to UK children. And, it is not the only one of its kind around the world.

What You Need to Know

In the digital age, people’s digital footprint is first established months before they’re born, grows exponentially throughout their lives, and carries on after their death. Kids are especially vulnerable to being influenced by digital services; moreover, they are more likely to fall victim to cybercrime. For instance, a 2011 Carnegie Mellon CyLab study found that children were 51% more likely to have their Social Security number used by someone else.

What is it for?

The code is designed to ensure that service providers of apps, programs, toys, or any other devices that collect and process children’s data “are appropriate for use by, and meet the development needs of, children.” It calls for 15 independent and interdependent legal, service design, and data processing principles and standards to be followed. Specifically, as called out in the code, these are:

• Best interests of the child
• Data protection impact assessments
• Age appropriate application
• Transparency
• Detrimental use of data
• Policies and community standards
• Default settings
• Data minimization
• Data sharing
• Geolocation
• Parental controls
• Profiling
• Nudge techniques
• Connected toys and devices
• Online tools

Who must comply? Risk of non-compliance

Information society services (ISS) that cater to UK children under the age of 18 or whose services are likely to be accessed by children must adhere to the code or risk ICO public assessment and enforcement notices, warnings, reprimands, and penalty notices (aka administrative fines). Serious code breaches may lead to fines of up to €20 million (or £17.5 million when the UK GDPR comes into effect) or 4% of the provider’s annual worldwide turnover, whichever is higher.

What you need to do

Ensuring the digital future and safety of children and vulnerable adults (people over 18 who cannot meet their own needs or seek help without assistance) is a founding principle of a healthy society and of running a sustainable business. There are several recommended steps you can take to get into and maintain alignment with the code:

1. Consider the likelihood that a child (or vulnerable adult) might use your service; the onus is on the service provider to determine the likelihood that a child will use their service. You should conduct user testing, surveys, market research (inc. competitive analysis), and professional and academic literature reviews.

2. Document your data flows (i.e. conduct a data protection impact assessment (DPIA)). Your DPIA should consist of data and systems flow diagrams and detailed descriptions of these systems (inc. services, processes, and interfaces), all the data flowing through them, and how data is handled. The flow diagram should illustrate each engagement swim lane (e.g., individual, client, company, third-party) and the direction of data flow. The supporting documentation should define each data element and clearly spell out how it will be collected, used and managed. Take it from me and my firsthand experience: the DPIA lens is an invaluable tool for strategic product development. I highly recommend that you not look at the DPIA process as a legal necessity but rather as a valuable framework for learning about and assessing how your products and services work, exactly what they do and why. Performed with the right lens, the DPIA can be a fertile ground for creative inspiration and innovation.

3. Make it a team sport. Effective data management is a company-wide, multi-disciplinary activity. It is important that you ensure all key stakeholders, not just legal, IT, security, and compliance, but also marketing, user experience, customer experience, design, support, product, sales, and third-party compliance partners (experts that can help you and your team succeed), all play a role in ensuring your products and services do not just meet legal requirements but exceed people’s expectations.

But There Is More

The above steps are extremely useful; however, there are two more considerations that should not be neglected: 1) things are just getting started, and 2) people’s sentiment, i.e. “the opportunity to differentiate.”

First, Gartner estimates that 10% of the world’s population is currently protected under people-centric regulation, i.e. regulations like Europe’s GDPR, California’s CCPA, Brazil’s LGPD, and China’s data protection law, which takes effect Nov. 1, 2021. By 2023 Gartner estimates this number will rise to 65%. Moreover, keep in mind, it will not just be omnibus rules that take effect; sector-specific rules will apply as well. For example, in the United States, the Federal Trade Commission is reevaluating its child protection laws (COPPA) and the U.S. Department of Education will more than likely be updating the Family Educational Rights and Privacy Act (FERPA). Moreover, there are state-specific regulations similar to the CCPA being enacted. For instance, in July 2021, Colorado enacted its own people-centric regulation, the “Privacy Act of 2021.”

Second, globally, as evidenced by the last seven years of the MEF Global Consumer Trust studies, people are waking up to industry data practices. To say that they’re unhappy about them is an understatement. People are connected, they’re concerned, and they want control of their data. The problem is, they’re not exactly sure how to go about it.

There is more to win than just staying on the right side of the law. There is an opportunity for companies to go beyond the law, to recognize that digital privacy, the controls and flows of one’s personal data, should be treated as sacred as one’s physical privacy. Privacy should not be a luxury good obtained only by a select minority; it’s a human right. All three major societal constituents (individuals, private sector organizations, and public sector institutions) need to play their part. There is an opportunity for each to weigh in on this debate, especially public sector players who can differentiate themselves by actively and publicly providing people not just with the utility of the company’s service but with tools and education that help people proactively enact their data rights and to secure and gain agency over their digital footprint, not just now but throughout their entire life. Like global warming, if we each do our part, we can get data back under control and achieve a healthy equilibrium throughout the world’s markets.

Useful Resources & Tools

• ICO’s Children’s Code of 2020, legal, design, and service principles and standards for protecting children’s data.
• ICO DPIA Template, note: does not include flow diagram examples, which is a miss.
• UNICEF Better Business for Children, industry-specific guidance to protect children’s rights.
• COPPA Safe Harbor Program, U.S. self-regulatory program for the protection of children’s data (6 companies have been certified).
• Ada for Trust, an art exhibit presented at MyData 2019 detailing 6 key “digital” life moments.
• UNICEF Better Business for Children, provides guidance and tool

The MEF Personal Data & Identity (PD&I) Working Group will be holding its next meeting on Sept. 20, 2021 at 7:00 AM PDT/2:00 PM GMT. The MEF welcomes marketer leaders looking to gather insight, interact with fellow leaders, and make an impact on the industry to join the MEF and the PD&I working group’s efforts.

The post ICO’s Child Protection Rules Take Effect Sept. 2, 2021. Are You Ready? appeared first on Identity Praxis, Inc..

Wednesday, 01. September 2021

The Dingle Group

Principles or Cult - An Irreverent Discussion on the Principles of SSI

At the end of 2020, the Sovrin Foundation published the 12 Principles of SSI, https://sovrin.org/principles-of-ssi/ . Building on the earlier works of Kim Cameron and Christopher Allen, these principles lay out what must be the foundational principles of any self-sovereign identity system. The evolution of the Principles of SSI came about through the need to differentiate what is ‘true’ SSI

At the end of 2020, the Sovrin Foundation published the 12 Principles of SSI, https://sovrin.org/principles-of-ssi/ . Building on the earlier works of Kim Cameron and Christopher Allen, these principles lay out what must be the foundational principles of any self-sovereign identity system. The evolution of the Principles of SSI came about through the need to differentiate what is ‘true’ SSI versus marketing forces twisting the concept. This market-driven motivation can bring cultish overtones to the process.

With this event we had a bit of fun in what is otherwise 'serious' business.

A few themes that come out of the discussion:

- The position of the holder of the credential - when does the holder become the source of truth and the key to value creation? In most current business value conversations the holder appears to be incidental; the focus is on how to incent the verifiers and issuers with little regard to the position and influence of the holder. While the SSI ‘triangle’ puts the holder at the ‘top’ of the triangle, the lack of consideration of the trust and value equation there implies the triangle should be inverted, elevating the issuers and verifiers and putting the holder on the bottom.

- When principles become a means of “locking in” a definition, they become a means of “locking out” those that do not meet the “bar”. A concern voiced was that with this mental model, and the implication of the moral ‘righteousness’ of this position, the real commercial values of SSI are lost and those not in the ‘cult’ turn away.

The panelists are:

Simone Ravaioli, Director Digitary

Rob van Kranenburg, IoT Council and NGI Forward

Nicky Hickman, Chair of the ID4All WG, Sovrin Foundation

Michael Shea, Moderator of Vienna Digital Identity Meetup

To listen to a recording of the event please check out the link: https://vimeo.com/manage/videos/580329620

Time markers:

0:05:00 - Introduction

0:10:07 - Simone Ravaioli

0:17:09 - Nicky Hickman

0:25:44 - Rob van Kranenburg

0:35:00 - General Discussion

1:15:07 - Rob van Kranenburg

1:23:45 - Wrap up


Resources

Slide decks:

Simone Ravaioli : Principles vs Cult (Simone)

Nicky Hickman : Principles vs Cult (Nicky)

Rob van Kranenburg : Slide Deck 1 Slide Deck 2

Blog Posts:

https://www.linkedin.com/pulse/stoic-sovereign-identity-ssi-simone-ravaioli/

If interested in getting notifications of upcoming events please join the event group at: https://www.meetup.com/Vienna-Digital-Identity-Meetup/

The Vienna Digital Identity Meetup is hosted by The Dingle Group and is focused on educating business, legal, societal and technology professionals on the new opportunities that arise with a high-assurance digital identity created through reduced risk and strengthened provenance. We meet on the 4th Monday of every month, in person (when permitted) in Vienna and online on Zoom, connecting and educating across borders and oceans.

Tuesday, 31. August 2021

Phil Windley's Technometria

Seeing Like the TSA

Summary: A recent trip through Salt Lake airport's new TSA screening area shows why bureaucratic legibility fails. People aren't corporate or hierarchical. Instead they're messy... and innovative.

I just flew for the first time in 16 months. In that time, Salt Lake International Airport got a new terminal, including an update to the TSA screening area. The new screening area has been touted as a model of efficiency, featuring bin stations for people to load their bags, electronics, belts, and shoes into bins that they then push onto a conveyor. The bins are handled automatically and everything is sunshine and joy. Except it isn't.

The new system is perfect so long as the people using it are too. The first problem is that unless you're at the last bin station, the conveyor in front of you is constantly full and it's hard to get your bin onto the conveyor. And if you've got more than one bin to load, they are separated from each other because the loading station isn't big enough for two. People just don't conform to the TSA's ideal!

But the real problem is that people forget things in their pockets or don't take off their belt. In the olden days, the TSA had little bowls. You'd throw your stuff in one, put it on the belt, and be on your way. Now, there's no easy way to accommodate forgotten things except to go back to a bin loading station and put them in a big bin, clogging the conveyor even more. Three people in line ahead of me at the scanner forgot something, causing all kinds of delays. The TSA agents ended up telling people to just hold forgotten items while in the scanner, or taking the items themselves to hold while the scan was completed. Because there's no good way to deal with forgotten items, everyone is forced to improvise, but the system is rigid and doesn't easily accommodate improvisation.

The situation reminded me of the story James C. Scott tells in the opening of Seeing Like a State where forestry officials planted neat, efficient rows of trees instead of letting the forest take its natural path. The end result was less yield from the forest, but happier foresters who could now see every tree. Scott's point is that bureaucracy aims for legibility in order to serve its own purposes—and usually fails in that effort. The primary reason states have wanted legibility of citizens is taxes (and, historically, conscription). But once you have legibility, the temptation to extend it to other uses is too great to resist. In this case, the TSA has ordered the screening process and made it legible to the screeners, but made no provision for outliers. If no one forgets anything and the system is lightly loaded, it should work great. Of course, that's not the real world.

IT people are bureaucrats in their own way. We build and operate the systems that people use to do their jobs and live their lives. We strive for legibility in order to make the software simpler for us, even if it doesn't serve the users quite as well. Universities are decentralized places with lots of innovative people pursuing their own goals. They are more feudal than corporate. I've often heard university IT people complain about this reality because it makes their life harder. If you're a professor, you'd like to use whichever LMS suits your particular needs. But that's not very legible. If you're a university IT person, you'd like to force all faculty to use the standard LMS that the university chose. Neat and orderly, but it squeezes the innovation out of the university one drop at a time.

Life is messy. People are forgetful, disorganized, and, relatedly, innovative. Bureaucracy desperately wants legibility so that the rules are followed, the processes perform, and the bureaucrat's life is made easy. Building systems that support decentralized workflows and individual decisions, without getting in the way, is hard. And letting people be people can be frustrating when it's causing you headaches. We'll never build systems that support an authentic, operationalized digital existence until we stop trying to fit people's decentralized lives into our neat, ordered, legible software.

Seeing like a State: How Certain Schemes to Improve the Human Condition Have Failed by James C. Scott

In this wide-ranging and original book, James C. Scott analyzes failed cases of large-scale authoritarian plans in a variety of fields. Centrally managed social plans misfire, Scott argues, when they impose schematic visions that do violence to complex interdependencies that are not—and cannot be—fully understood. Further, the success of designs for social organization depends upon the recognition that local, practical knowledge is as important as formal, epistemic knowledge. The author builds a persuasive case against "development theory" and imperialistic state planning that disregards the values, desires, and objections of its subjects. He identifies and discusses four conditions common to all planning disasters: administrative ordering of nature and society by the state; a "high-modernist ideology" that places confidence in the ability of science to improve every aspect of human life; a willingness to use authoritarian state power to effect large-scale interventions; and a prostrate civil society that cannot effectively resist such plans.

Photo Credit: Security from Anelise Bergin (none)

Tags: legibility decentralization security flying identity university

Monday, 30. August 2021

Simon Willison

Quoting Avery Pennarun

Unshipped work is inventory and it costs you money as it spoils

Avery Pennarun


Damien Bod

Improving application security in an ASP.NET Core API using HTTP headers – Part 3

This article shows how to improve the security of an ASP.NET Core Web API application by adding security headers to all HTTP API responses. The security headers are added using the NetEscapades.AspNetCore.SecurityHeaders Nuget package from Andrew Lock. The headers are used to protect the session, not for authorization. The application uses Microsoft.Identity.Web to authorize the API requests. Swagger is used in development, and the CSP needs to be weakened to allow Swagger to work during development. A strict CSP definition is used for the deployed environment.

Code: https://github.com/damienbod/AzureAD-Auth-MyUI-with-MyAPI

Blogs in this series

Improving application security in ASP.NET Core Razor Pages using HTTP headers – Part 1 Improving application security in Blazor using HTTP headers – Part 2 Improving application security in an ASP.NET Core API using HTTP headers – Part 3

The NetEscapades.AspNetCore.SecurityHeaders Nuget package is added to the csproj file of the web applications. The Swagger Open API packages are added, as well as Microsoft.Identity.Web to protect the API using OAuth.

<ItemGroup>
  <PackageReference Include="Microsoft.Identity.Web" Version="1.15.2" />
  <PackageReference Include="IdentityModel.AspNetCore" Version="3.0.0" />
  <PackageReference Include="NetEscapades.AspNetCore.SecurityHeaders" Version="0.16.0" />
  <PackageReference Include="Swashbuckle.AspNetCore" Version="6.1.4" />
  <PackageReference Include="Swashbuckle.AspNetCore.Annotations" Version="6.1.4" />
</ItemGroup>

The security header definitions are added using the HeaderPolicyCollection class. I added this to a separate class to keep the Startup class small where the middleware is added. I passed a boolean parameter into the method which is used to add or remove the HSTS header and create a CSP policy depending on the environment.

public static HeaderPolicyCollection GetHeaderPolicyCollection(bool isDev)
{
    var policy = new HeaderPolicyCollection()
        .AddFrameOptionsDeny()
        .AddXssProtectionBlock()
        .AddContentTypeOptionsNoSniff()
        .AddReferrerPolicyStrictOriginWhenCrossOrigin()
        .RemoveServerHeader()
        .AddCrossOriginOpenerPolicy(builder => { builder.SameOrigin(); })
        .AddCrossOriginEmbedderPolicy(builder => { builder.RequireCorp(); })
        .AddCrossOriginResourcePolicy(builder => { builder.SameOrigin(); })
        .RemoveServerHeader()
        .AddPermissionsPolicy(builder =>
        {
            builder.AddAccelerometer().None();
            builder.AddAutoplay().None();
            builder.AddCamera().None();
            builder.AddEncryptedMedia().None();
            builder.AddFullscreen().All();
            builder.AddGeolocation().None();
            builder.AddGyroscope().None();
            builder.AddMagnetometer().None();
            builder.AddMicrophone().None();
            builder.AddMidi().None();
            builder.AddPayment().None();
            builder.AddPictureInPicture().None();
            builder.AddSyncXHR().None();
            builder.AddUsb().None();
        });

    AddCspHstsDefinitions(isDev, policy);

    return policy;
}

The AddCspHstsDefinitions defines different policies using the parameter. In development, the HSTS header is not added to the headers and a weak CSP is used so that the Swagger UI will work. This UI uses unsafe inline Javascript and needs to be allowed in development. I remove swagger from all non dev deployments due to this and force a strong CSP definition then.

private static void AddCspHstsDefinitions(bool isDev, HeaderPolicyCollection policy)
{
    if (!isDev)
    {
        policy.AddContentSecurityPolicy(builder =>
        {
            builder.AddObjectSrc().None();
            builder.AddBlockAllMixedContent();
            builder.AddImgSrc().None();
            builder.AddFormAction().None();
            builder.AddFontSrc().None();
            builder.AddStyleSrc().None();
            builder.AddScriptSrc().None();
            builder.AddBaseUri().Self();
            builder.AddFrameAncestors().None();
            builder.AddCustomDirective("require-trusted-types-for", "'script'");
        });

        // maxage = one year in seconds
        policy.AddStrictTransportSecurityMaxAgeIncludeSubDomains(maxAgeInSeconds: 60 * 60 * 24 * 365);
    }
    else
    {
        // allow swagger UI for dev
        policy.AddContentSecurityPolicy(builder =>
        {
            builder.AddObjectSrc().None();
            builder.AddBlockAllMixedContent();
            builder.AddImgSrc().Self().From("data:");
            builder.AddFormAction().Self();
            builder.AddFontSrc().Self();
            builder.AddStyleSrc().Self().UnsafeInline();
            builder.AddScriptSrc().Self().UnsafeInline(); //.WithNonce();
            builder.AddBaseUri().Self();
            builder.AddFrameAncestors().None();
        });
    }
}

In the Startup class, the UseSecurityHeaders method is used to apply the HTTP headers policy and add the middleware to the application. The env.IsDevelopment() is used to add or not add the HSTS header. The default HSTS middleware from the ASP.NET Core templates was removed from the Configure method as this is not required. The UseSecurityHeaders is added before the Swagger middleware so that the security headers are deployed to all environments.

public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    app.UseSecurityHeaders(
        SecurityHeadersDefinitions.GetHeaderPolicyCollection(env.IsDevelopment()));

    if (env.IsDevelopment())
    {
        app.UseDeveloperExceptionPage();
        app.UseSwagger();
        app.UseSwaggerUI(c =>
        {
            c.SwaggerEndpoint("/swagger/v1/swagger.json", "API v1");
        });
    }

The server header can be removed in the program class if using Kestrel. If using IIS, you probably need to use the web.config to remove this.

public static IHostBuilder CreateHostBuilder(string[] args) =>
    Host.CreateDefaultBuilder(args)
        .ConfigureWebHostDefaults(webBuilder =>
        {
            webBuilder
                .ConfigureKestrel(options => options.AddServerHeader = false)
                .UseStartup<Startup>();
        });

Running the application in a non-development environment, the securityheaders.com check returns good results. Everything is closed, as this is an API with no UI.

And the https://csp-evaluator.withgoogle.com/ check returns a very positive evaluation of the headers.

If a swagger UI is required, the API application can be run in the development environment. This could also be deployed if required, but in a production deployment, you probably don’t need this.

To support the swagger UI, a weakened CSP is used and the https://csp-evaluator.withgoogle.com/ check returns a more negative result.

Notes:

If possible, I block all traffic that does not come from my domain, including subdomains. If implementing enterprise applications, I would always do this. If implementing public-facing applications with high traffic volumes, a need for extra-fast response times, or a need to reduce hosting costs, then CDNs would need to be used and allowed, and so on. Try to block everything first and open up only as required, and maybe you can avoid some nasty surprises from all the Javascript and CSS frameworks used.

Links

https://securityheaders.com/

https://csp-evaluator.withgoogle.com/

Security by Default Chrome developers

A Simple Guide to COOP, COEP, CORP, and CORS

https://github.com/andrewlock/NetEscapades.AspNetCore.SecurityHeaders

https://github.com/dotnet/aspnetcore/issues/34428

https://w3c.github.io/webappsec-trusted-types/dist/spec/

https://web.dev/trusted-types/

https://developer.mozilla.org/en-US/docs/Web/HTTP/Cross-Origin_Resource_Policy_(CORP)

https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS

https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies

https://docs.google.com/document/d/1zDlfvfTJ_9e8Jdc8ehuV4zMEu9ySMCiTGMS9y0GU92k/edit

https://scotthelme.co.uk/coop-and-coep/

https://github.com/OWASP/ASVS


Simon Willison

Building a desktop application for Datasette (and weeknotes)

This week I started experimenting with a desktop application version of Datasette - with the goal of providing people who aren't comfortable with the command-line the ability to get Datasette up and running on their own personal computers.

Update 8th September 2021: I made a bunch more progress over the week following this post, see Datasette Desktop—a macOS desktop application for Datasette for details or download the app to try it out.

Why a desktop application?

On Monday I kicked off an enormous Twitter conversation when I posted:

I wonder how much of the popularity of R among some communities in comparison to Python comes down to the fact that with R you can install the RStudio desktop application and you're ready to go

This ties into my single biggest complaint about Python: it's just too hard for people to get started with. Setting up a Python development environment for the first time remains an enormous barrier to entry.

I later put this in stronger terms:

The more I think about this the more frustrated I get, thinking about the enormous amount of human potential that's squandered because the barriers to getting started learning to program are so much higher than they need to be

Which made me think of glass houses. My own Datasette project has exactly the same problem: to run it locally you need to install Python and then install Datasette! Mac users can use Homebrew, but telling newcomers to install Homebrew first isn't particularly welcoming either.

Ideally, I'd like people to be able to install a regular desktop application and start using Datasette that way, without even needing to know that it's written in Python.

There's been an open issue to get Datasette running as a standalone binary using PyInstaller since November 2017, with quite a bit of research.

But I want a UI as well: I don't want to have to teach new users how to install and run a command-line application if I can avoid it.

So I decided to spend some time researching Electron to see how hard it would be to make a basic Datasette desktop application a reality.

Progress so far

The code I've written so far can be found in the simonw/datasette.app repository on GitHub. The app so far does the following:

- Run a datasette server on localhost attached to an available port (found using portfinder) which terminates when the app quits.
- Open a desktop window showing that Datasette instance once the server has started.
- Allow additional windows onto the same instance to be opened using the "New Window" menu option or the Command+N keyboard shortcut.
- Provides an "Open Database..." menu option (and Command+O shortcut) which brings up a file picker to allow the user to select a SQLite database file to open - once selected, this is attached to the Datasette instance and any windows showing the Datasette homepage are reloaded.

Here's a video demo showing these features in action:

It's very much an MVP, but I'm encouraged by the progress so far. I think this is enough of a proof of concept to be worth turning this into an actual usable product.

How this all works

There are two components to the application.

The first is a thin Electron shell, responsible for launching the Python server, managing windows and configuring the various desktop menu options used to configure it. The code for that lives in main.js.

The second is a custom Datasette plugin that adds extra functionality needed by the application. Currently this consists of a tiny bit of extra CSS to make the footer stick to the bottom of the window, and a custom API endpoint at /-/open-database-file which is called by the menu option for opening a new database.
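To give a feel for how a plugin can expose that kind of endpoint, here is a minimal sketch using Datasette's register_routes() plugin hook. The handler body and its ?path= parameter are hypothetical illustrations, not the actual code from the datasette.app plugin:

from datasette import hookimpl
from datasette.database import Database
from datasette.utils.asgi import Response


async def open_database_file(request, datasette):
    # Hypothetical handler: attach the SQLite file named in ?path= to the
    # running Datasette instance, then confirm success as JSON.
    path = request.args.get("path")
    if not path:
        return Response.json({"ok": False, "error": "?path= is required"}, status=400)
    datasette.add_database(Database(datasette, path=path, is_mutable=True))
    return Response.json({"ok": True})


@hookimpl
def register_routes():
    # Expose the handler at /-/open-database-file
    return [(r"^/-/open-database-file$", open_database_file)]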

Initial impressions of Electron

I know it's cool to knock Electron, but in this case it feels like exactly the right tool for the job. Datasette is already a web application - what I need is a way to hide the configuration of that web application behind an icon, and re-present the interface in a way that feels more like a desktop application.

This is my first time building anything with Electron - here are some of my initial impressions.

- The initial getting started workflow is really good. I started out with their Quick Start and was up and running with a barebones application that I could start making changes to in just a few minutes.
- The documentation is pretty good, but it leans more towards being an API reference. I found myself googling for examples of different things I wanted to do pretty often.
- The automated testing situation isn't great. I'm using Spectron and Mocha for my initial (very thin) tests - I got them up and running in GitHub Actions, but I've already run into some limitations:
  - For some reason each time I run the tests an Electron window (and datasette Python process) is left running. I can't figure out why this is.
  - There doesn't appear to be a way for tests to trigger menu items, which is frustrating because most of the logic I've written so far deals with menu items! There is an open issue for this dating back to May 2016.
- I haven't yet managed to package my app. This is clearly going to be the biggest challenge.

Up next: packaging the app

I was hoping to get to this before writing up my progress in these weeknotes, but it looks like it's going to be quite a challenge.

In order to produce an installable macOS app (I'll dive into Windows later) I need to do the following:

- Build a standalone Datasette executable, complete with the custom plugin, using PyInstaller (see the sketch after this list)
- Sign that binary with an Apple developer certificate
- Build an Electron application that bundles a copy of that datasette binary
- Sign the resulting Electron application
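As a rough sketch of the first step only - not the actual build script used by datasette.app - PyInstaller can be driven from Python. The launcher file name and options here are assumptions, and a real build would also need hidden-import flags for Datasette's dependencies and plugins:

# build_datasette_binary.py -- illustrative only
import pathlib
import PyInstaller.__main__

# A tiny launcher script that simply invokes Datasette's click CLI.
launcher = pathlib.Path("launcher.py")
launcher.write_text(
    "from datasette.cli import cli\n"
    "if __name__ == '__main__':\n"
    "    cli()\n"
)

PyInstaller.__main__.run([
    "--onefile",          # produce a single standalone executable
    "--name", "datasette",
    str(launcher),
])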

I'm expecting figuring this out to be a long-winded and frustrating experience, which is more the fault of Apple than of Electron. I'm tracking my progress on this in issue #7.

Datasette 0.59a2

I pushed out a new alpha of Datasette earlier this week, partly driven by work I was doing on datasette.app.

The biggest new feature in this release is a new plugin hook: register_commands() - which lets plugins add additional commands to Datasette, e.g. datasette verify name-of-file.db.
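Here is roughly what a plugin using that hook looks like - a sketch of the shape of the hook with a hypothetical verify command, not the actual datasette-verify implementation:

import sqlite3

import click
from datasette import hookimpl


@hookimpl
def register_commands(cli):
    # "cli" is Datasette's click command group; anything added here
    # becomes available as "datasette <command>".
    @cli.command()
    @click.argument("files", type=click.Path(exists=True), nargs=-1)
    def verify(files):
        "Check that each FILE can be opened as a SQLite database (sketch)."
        for path in files:
            conn = sqlite3.connect(path)
            try:
                conn.execute("select * from sqlite_master limit 1")
            finally:
                conn.close()
            click.echo(f"{path}: ok")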

I released a new plugin that exercises this hook called datasette-verify. Past experience has shown me that it's crucial to ship an example plugin alongside a new hook, to help confirm that the hook design is fit for purpose.

It turns out I didn't need this for datasette.app after all, but it's still a great capability to have!

sqlite-utils 3.17

Quoting the release notes in full:

- The sqlite-utils memory command has a new --analyze option, which runs the equivalent of the analyze-tables command directly against the in-memory database created from the incoming CSV or JSON data. (#320)
- sqlite-utils insert-files now has the ability to insert file contents into TEXT columns in addition to the default BLOB. Pass the --text option or use content_text as a column specifier. (#319)
evernote-to-sqlite 0.3.2

As a follow-up to last week's work on my personal Dogsheep, I decided to re-import my Evernote notes... and found out that Evernote has changed their export mechanism in ways that broke my tool. Most concerningly their exported XML is even less well-formed than it used to be. This new release works around that.

TIL this week Searching all columns of a table in Datasette Releases this week datasette-verify: 0.1 - 2021-08-28
Verify that files can be opened by Datasette datasette: 0.59a2 - (97 releases total) - 2021-08-28
An open source multi-tool for exploring and publishing data evernote-to-sqlite: 0.3.2 - (5 releases total) - 2021-08-26
Tools for converting Evernote content to SQLite sqlite-utils: 3.17 - (86 releases total) - 2021-08-24
Python CLI utility and library for manipulating SQLite databases

Saturday, 28. August 2021

Simon Willison

Dynamic content for GitHub repository templates using cookiecutter and GitHub Actions

GitHub repository templates were introduced a couple of years ago to provide a mechanism for creating a brand new GitHub repository starting with an initial set of files.

They have one big limitation: the repositories that they create share the exact same contents as the template repository. They're basically a replacement for duplicating an existing folder and using that as the starting point for a new project.

I'm a big fan of the Python cookiecutter tool, which provides a way to dynamically create new folder structures from user-provided variables using Jinja templates to generate content.

This morning, inspired by this repo by Bruno Rocha, I finally figured out a neat pattern for combining cookiecutter with repository templates to compensate for that missing dynamic content ability.

The result: datasette-plugin-template-repository for creating new Datasette plugins with a single click, python-lib-template-repository for creating new Python libraries and click-app-template-repository for creating Click CLI tools.

Cookiecutter

I maintain three cookiecutter templates at the moment:

- simonw/datasette-plugin, for creating new Datasette plugins. I've used that one for dozens of plugins myself.
- simonw/click-app, which generates a skeleton for a new Click-based command-line tool. Many of my x-to-sqlite tools were built using this.
- simonw/python-lib, for generating general-purpose Python libraries.

Having installed cookiecutter (pip install cookiecutter) each of these can be used like so:

% cookiecutter gh:simonw/datasette-plugin
plugin_name []: visualize counties
description []: Datasette plugin for visualizing counties
hyphenated [visualize-counties]:
underscored [visualize_counties]:
github_username []: simonw
author_name []: Simon Willison
include_static_directory []: y
include_templates_directory []:

Cookiecutter prompts for some variables defined in a cookiecutter.json file, then generates the project by evaluating the templates.
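Cookiecutter can also be driven from Python rather than the CLI, which is worth keeping in mind for the automation below. A minimal sketch, using the same variable values as the interactive session above:

from cookiecutter.main import cookiecutter

# Generate a new Datasette plugin skeleton without interactive prompts.
# extra_context overrides the defaults defined in cookiecutter.json.
cookiecutter(
    "gh:simonw/datasette-plugin",
    no_input=True,
    extra_context={
        "plugin_name": "visualize counties",
        "description": "Datasette plugin for visualizing counties",
        "github_username": "simonw",
        "author_name": "Simon Willison",
    },
)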

The challenge was: how can I run this automatically when a new repository is created from a GitHub repository template? And where can I get those variables from?

Bruno's trick: a self-rewriting repository

Bruno has a brilliant trick for getting this to run, exhibited by this workflow YAML. His workflow starts like this:

name: Rename the project from template
on: [push]
jobs:
  rename-project:
    if: ${{ github.repository != 'rochacbruno/python-project-template' }}
    runs-on: ubuntu-latest
    steps:
      # ...

This means that his workflow only runs on copies of the original repository - the workflow is disabled in the template repository itself by that if: condition.

Then at the end of the workflow he does this:

- uses: stefanzweifel/git-auto-commit-action@v4
  with:
    commit_message: "Ready to clone and code"
    push_options: --force

This does a force push to replace the contents of the repository with whatever was generated by the rest of the workflow script!

This trick was exactly what I needed to get cookiecutter to work with repository templates.

Gathering variables using the GitHub GraphQL API

All three of my existing cookiecutter templates require the following variables:

A name to use for the generated folder A one-line description to use in the README and in setup.py The GitHub username of the owner of the package The display name of the owner

I need values for all of these before I can run cookiecutter.

It turns out they are all available from the GitHub GraphQL API, which can be called from the initial workflow copied from the repository template!

Here's the GitHub Actions step that does that:

- uses: actions/github-script@v4
  id: fetch-repo-and-user-details
  with:
    script: |
      const query = `query($owner:String!, $name:String!) {
        repository(owner:$owner, name:$name) {
          name
          description
          owner {
            login
            ... on User {
              name
            }
            ... on Organization {
              name
            }
          }
        }
      }`;
      const variables = {
        owner: context.repo.owner,
        name: context.repo.repo
      }
      const result = await github.graphql(query, variables)
      console.log(result)
      return result

Here I'm using the actions/github-script action, which provides a pre-configured, authenticated instance of GitHub's octokit/rest.js JavaScript library. You can then provide custom JavaScript that will be executed by the action.

await github.graphql(query, variables) can then execute a GitHub GraphQL query. The query I'm using here gives me back the current repository's name and description and the login and display name of the owner of that repository.

GitHub repositories can be owned by either a user or an organization - the ... on User / ... on Organization syntax provides the same result here for both types of nested object.

The output of this GraphQL query looks something like this:

{ "repository": { "name": "datasette-verify", "description": "Verify that files can be opened by Datasette", "owner": { "login": "simonw", "name": "Simon Willison" } } }

I assigned an id of fetch-repo-and-user-details to that step of the workflow, so that the return value from the script could be accessed as JSON in the next step.
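If you want to experiment with the query outside of GitHub Actions, any GraphQL client will do. Here's a minimal Python sketch, assuming the requests library and a personal access token in a GITHUB_TOKEN environment variable:

import os
import requests

QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    name
    description
    owner {
      login
      ... on User { name }
      ... on Organization { name }
    }
  }
}
"""

# Run the same query the workflow runs, against one example repository.
response = requests.post(
    "https://api.github.com/graphql",
    json={"query": QUERY, "variables": {"owner": "simonw", "name": "datasette-verify"}},
    headers={"Authorization": "Bearer {}".format(os.environ["GITHUB_TOKEN"])},
)
response.raise_for_status()
print(response.json()["data"]["repository"])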

Passing those variables to cookiecutter

Cookiecutter defaults to asking for variables interactively, but it also supports passing in those variables as command-line parameters.

Here's part of my next workflow steps that executes cookiecutter using the variables collected by the GraphQL query:

- name: Rebuild contents using cookiecutter
  env:
    INFO: ${{ steps.fetch-repo-and-user-details.outputs.result }}
  run: |
    export REPO_NAME=$(echo $INFO | jq -r '.repository.name')
    # Run cookiecutter
    cookiecutter gh:simonw/python-lib --no-input \
      lib_name=$REPO_NAME \
      description="$(echo $INFO | jq -r .repository.description)" \
      github_username="$(echo $INFO | jq -r .repository.owner.login)" \
      author_name="$(echo $INFO | jq -r .repository.owner.name)"

The env: INFO: block exposes an environment variable called INFO to the step, populated with the output of the previous fetch-repo-and-user-details step - a string of JSON.

Then within the body of the step I use jq to extract out the details that I need - first the repository name:

export REPO_NAME=$(echo $INFO | jq -r '.repository.name')

Then I pass the other details directly to cookiecutter as arguments:

cookiecutter gh:simonw/python-lib --no-input \
  lib_name=$REPO_NAME \
  description="$(echo $INFO | jq -r .repository.description)" \
  github_username="$(echo $INFO | jq -r .repository.owner.login)" \
  author_name="$(echo $INFO | jq -r .repository.owner.name)"

jq -r ensures that the raw text value is returned by jq, as opposed to the JSON string value which would be wrapped in double quotes.

Cleaning up at the end

Running cookiecutter in this way creates a folder within the root of the repository that duplicates the repository name, something like this:

datasette-verify/datasette-verify

I actually want the contents of that folder to live in the root, so the next step I run is:

mv $REPO_NAME/* .
mv $REPO_NAME/.gitignore .
mv $REPO_NAME/.github .

Here's my completed workflow.

This almost worked - but when I tried to run it for the first time I got this error:

![remote rejected] (refusing to allow an integration to create or update .github/workflows/publish.yml)

It turns out the credentials provided to GitHub Actions are forbidden from making modifications to their own workflow files!

I can understand why that limitation is in place, but it's frustrating here. For the moment, my workaround is to do this just before pushing the final content back to the repository:

mv .github/workflows .github/rename-this-to-workflows

I leave it up to the user to rename that folder back again when they want to enable the workflows that have been generated for them.

Give these a go

I've set up three templates using this pattern now:

- datasette-plugin-template-repository for creating new Datasette plugins - use this template
- python-lib-template-repository for creating new Python libraries - use this template
- click-app-template-repository for creating new Python Click CLI tools - use this template

All three work the same way: enter a repository name and description, click "Create repository from template" and watch as GitHub copies the new repository and then, a few seconds later, runs the workflow to execute the cookiecutter template to replace the contents with the final result.

You can see examples of repositories that I created using these templates here:

https://github.com/simonw/datasette-plugin-template-repository-demo
https://github.com/simonw/python-lib-template-repository-demo
https://github.com/simonw/click-app-template-repository-demo

Jon Udell

Working with interdependent Postgres functions and materialized views

In Working with Postgres types I showed an example of a materialized view that depends on a typed set-returning function. Because Postgres knows about that dependency, it won’t allow DROP FUNCTION foo. Instead it requires DROP FUNCTION foo CASCADE.

A similar thing happens with materialized views that depend on tables or other materialized views. Let’s build a cascade of views and consider the implications.

create materialized view v1 as (
  select 1 as number, 'note_count' as label
);
SELECT 1

select * from v1;
 number | label
--------+------------
      1 | note_count

Actually, before continuing the cascade, let’s linger here for a moment. This is a table-like object created without using CREATE TABLE and without explicitly specifying types. But Postgres knows the types.

\d v1;
Materialized view "public.v1"
 Column |  Type
--------+---------
 number | integer
 label  | text

The read-only view can become a read-write table like so.

create table t1 as (select * from v1);
SELECT 1

select * from t1;
 number | label
--------+------------
      1 | note_count

\d t1
Table "public.t1"
 Column |  Type
--------+---------
 number | integer
 label  | text

This ability to derive a table from a materialized view will come in handy later. It’s also just interesting to see how the view’s implicit types become explicit in the table.

OK, let’s continue the cascade.

create materialized view v2 as (
  select number + 1, label from v1
);
SELECT 1

select * from v2;
 number | label
--------+------------
      2 | note_count

create materialized view v3 as (
  select number + 1, label from v2
);
SELECT 1

select * from v3;
 number | label
--------+------------
      3 | note_count

Why do this? Arguably you shouldn’t. Laurenz Albe makes that case in Tracking view dependencies in PostgreSQL. Recognizing that it’s sometimes useful, though, he goes on to provide code that can track recursive view dependencies.

I use cascading views advisedly to augment the use of CTEs and functions described in Postgres functional style. Views that refine views can provide a complementary form of the chunking that aids reasoning in an analytics system. But that's a topic for another episode. In this episode I'll describe a problem that arose in a case where there's only a single level of dependency from a table to a set of dependent materialized views, and discuss my solution to that problem.

Here’s the setup. We have an annotation table that’s reloaded nightly. On an internal dashboard we have a chart based on the materialized view annos_at_month_ends_for_one_year which is derived from the annotation table and, as its name suggests, reports annotation counts on a monthly cycle. At the beginning of the nightly load, this happens: DROP TABLE annotation CASCADE. So the derived view gets dropped and needs to be recreated as part of the nightly process. But that’s a lot of unnecessary work for a chart that only changes monthly.

Here are two ways to protect a view from a cascading drop of the table it depends on. Both reside in a SQL script, monthly.sql, that only runs on the first of every month. First, annos_at_month_ends_for_one_year.

drop materialized view annos_at_month_ends_for_one_year;

create materialized view annos_at_month_ends_for_one_year as (
  with last_days as (
    select
      last_days_of_prior_months(
        date(last_month_date() - interval '6 year')
      ) as last_day
  ),
  monthly_counts as (
    select
      to_char(last_day, '_YYYY-MM') as end_of_month,
      anno_count_between(
        date(last_day - interval '1 month'), last_day
      ) as monthly_annos
    from last_days
  )
  select
    end_of_month,
    monthly_annos,
    sum(monthly_annos) over (
      order by end_of_month asc
      rows between unbounded preceding and current row
    ) as cumulative_annos
  from monthly_counts
) with data;

Because this view depends indirectly on the annotation table — by way of the function anno_count_between — Postgres doesn’t see the dependency. So the view isn’t affected by the cascading drop of the annotation table. It persists until, once a month, it gets dropped and recreated.

What if you want Postgres to know about such a dependency, so that the view will participate in a cascading drop? You can do this.

create materialized view annos_at_month_ends_for_one_year as (
  with depends as (
    select * from annotation limit 1
  ),
  last_days as (
  ),
  monthly_counts as (
  )
  select * from monthly_counts;

The depends CTE doesn’t do anything relevant to the query, it just tells Postgres that this view depends on the annotation table.

Here’s another way to protect a view from a cascading drop. This expensive-to-build view depends directly on the annotation table but only needs to be updated monthly. So in this case, cumulative_annotations is a table derived from a temporary materialized view.

create materialized view _cumulative_annotations as (
  with data as (
    select
      to_char(created, 'YYYY-MM') as created,
      count(*) as count
    from annotation
    group by created
  )
  select
    data.created,
    sum(data.count) over (
      order by data.created asc
      rows between unbounded preceding and current row
    )
  from data
  group by data.created, data.count
  order by data.created
);

drop table cumulative_annotations;

create table cumulative_annotations as (
  select * from _cumulative_annotations
);

drop materialized view _cumulative_annotations;

The table cumulative_annotations is only rebuilt once a month. It depends indirectly on the annotation table but Postgres doesn’t see that, so doesn’t include it in the cascading drop.

Here’s the proof.

-- create a table
create table t1 (number int);
insert into t1 (number) values (1);
INSERT 0 1

select * from t1;
 number
--------
      1

-- derive a view from t1
create materialized view v1 as (select * from t1);
SELECT 1

select * from v1;
 number
--------
      1

-- try to drop t1
drop table t1;
ERROR: cannot drop table t1 because other objects depend on it
DETAIL: materialized view v1 depends on table t1
HINT: Use DROP ... CASCADE to drop the dependent objects too.

-- derive an independent table from t1 by way of a matview
drop materialized view v1;
create materialized view v1 as (select * from t1);
SELECT 1

create table t2 as (select * from v1);
SELECT 1

-- drop the matview
drop materialized view v1;

-- drop t1
drop table t1;

-- no complaint, and t2 still exists
select * from t2;
 number
--------
      1

These are two ways I’ve found to protect a long-lived result set from the cascading drop of a short-lived table on which it depends. You can hide the dependency behind a function, or you can derive an independent table by way of a transient materialized view. I use them interchangeably, and don’t have a strong preference one way or another. Both lighten the load on the analytics server. Materialized views (or tables) that only need to change weekly or monthly, but were being dropped nightly by cascade from core tables, are now recreated only on their appropriate weekly or monthly cycles.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Friday, 27. August 2021

Jon Udell

Working in a hybrid Metabase / Postgres code base

In this series I’m exploring how to work in a code base that lives in two places: Metabase questions that encapsulate chunks of SQL, and Postgres functions/procedures/views that also encapsulate chunks of SQL. To be effective working with this hybrid collection of SQL chunks, I needed to reason about their interrelationships. One way to do that entailed the creation of a documentation generator that writes a concordance of callers and callees.

Here’s the entry for the function sparklines_for_guid(_guid).

The called by column says that this function is called from two different contexts. One is a Metabase question, All courses: Weekly activity. If you’re viewing that question in Metabase, you’ll find that its SQL text is simply this:

select * from sparklines_for_guid( {{guid}} )

The same function call appears in a procedure, cache warmer, that preemptively memoizes the function for a set of the most active schools in the system. In either case, you can look up the function in the concordance, view its definition, and review how it’s called.

In the definition of sparklines_for_guid, names of other functions (like guid_hashed_view_exists) appear and are linked to their definitions. Similarly, names of views appearing in SELECT or JOIN contexts are linked to their definitions.

Here’s the entry for the function guid_hashed_view_exists. It is called by sparklines_for_guid as well as by functions that drive panels on the school dashboard. It links to the functions it uses: hash_for_guid and exists_view.

Here’s the entry for the view lms_course_groups which appears as a JOIN target in sparklines_for_guid. This central view is invoked — in SELECT or JOIN context — from many functions, from dependent views, and from Metabase questions.

Metabase questions can also “call” other Metabase questions. In A virtuous cycle for analytics I noted: “Queries can emit URLs in order to compose themselves with other queries.” Here’s an example of that.

This Metabase question (985) calls various linked functions, and is called by two other Metabase questions. Here is one of those.

Because this Metabase question (600) emits an URL that refers to 985, it links to the definition of 985. It also links to the view, top_annotated_domains_last_week, from which it SELECTs.

It was straightforward to include Postgres functions, views, and procedures in the concordance since these live in files that reside in the filesystem under source control. Metabase questions, however, live in a database — in our case, a Postgres database that's separate from our primary Postgres analytics db. In order to extract them into a file I use this SQL snippet.

select
  r.id,
  m.name as dbname,
  r.name,
  r.description,
  r.dataset_query
from report_card r
join metabase_database m
  on m.id = cast(r.dataset_query::jsonb->>'database' as int)
where not r.archived
order by r.id;

The doc generator downloads that Metabase data as a CSV file, queries.csv, and processes it along with the files that contain the definitions of functions, procedures, and views in the Postgres data warehouse. It also emits queries.txt which is a more convenient way to diff changes from one commit to the next.
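The post doesn't show the export step itself; as a rough sketch of what it might look like in Python (assuming psycopg2 and a connection string for the Metabase application database in a METABASE_DB_URL environment variable; only the file names queries.csv and queries.txt come from the text above):

import csv
import os
import psycopg2

EXPORT_SQL = """
select r.id, m.name as dbname, r.name, r.description, r.dataset_query
from report_card r
join metabase_database m
  on m.id = cast(r.dataset_query::jsonb->>'database' as int)
where not r.archived
order by r.id;
"""

# Pull the non-archived Metabase questions out of the application database.
conn = psycopg2.connect(os.environ["METABASE_DB_URL"])
with conn, conn.cursor() as cur:
    cur.execute(EXPORT_SQL)
    rows = cur.fetchall()
    columns = [desc[0] for desc in cur.description]

# queries.csv feeds the doc generator; queries.txt is easier to diff.
with open("queries.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(columns)
    writer.writerows(rows)

with open("queries.txt", "w") as f:
    for row in rows:
        f.write(" | ".join(str(value) for value in row) + "\n")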

This technique solved a couple of problems. First, when we were only using Metabase — unaugmented by anything in Postgres — it enabled us to put our Metabase SQL under source control and helped us visualize relationships among queries.

Later, as we augmented Metabase with Postgres functions, procedures, and views, it became even more powerful. Developing a new panel for a school or course dashboard means writing a new memoized function. That process begins as a Metabase question with SQL code that calls existing Postgres functions, and/or JOINs/SELECTs FROM existing Postgres views. Typically it then leads to the creation of new supporting Postgres functions and/or views. All this can be tested by internal users, or even invited external users, in Metabase, with the concordance available to help understand the relationships among the evolving set of functions and views.

When supporting functions and views are finalized, the SQL content of the Metabase question gets wrapped up in a memoized Postgres function that’s invoked from a panel of a dashboard app. At that point the concordance links the new wrapper function to the same set of supporting functions and views. I’ve found this to be an effective way to reason about a hybrid code base as features move from Metabase for prototyping to Postgres in production, while maintaining all the code under source control.

That foundation of source control is necessary, but maybe not sufficient, for a team to consider this whole approach viable. The use of two complementary languages for in-database programming will certainly raise eyebrows, and if it’s not your cup of tea I completely understand. If you do find it appealing, though, one thing you’ll wonder about next will be tooling. I work in VSCode nowadays, for which I’ve not yet found a useful extension for pl/pgsql or pl/python. With metaprogramming life gets even harder for aspiring pl/pgsql or pl/python VSCode extensions. I can envision them, but I’m not holding my breath awaiting them. Meanwhile, two factors enable VSCode to be helpful even without deep language-specific intelligence.

The first factor, and by far the dominant one, is outlining. In Products and capabilities I reflect on how I’ve never adopted an outlining product, but often rely on outlining capability infused into a product. In VSCode that’s “only” basic indentation-driven folding and unfolding. But I find it works remarkably well across SQL queries, views and functions that embed them, CTEs that comprise them, and pl/pgsql or pl/python functions called by them.

The second factor, nascent thus far, is GitHub Copilot. It’s a complementary kind of language intelligence that’s aware of, but not bounded by, what a file extension of .sql or .py implies. It can sometimes discern patterns that mix language syntaxes and offer helpful suggestions. That hasn’t happened often so far, but it’s striking when it does. I don’t yet know the extent to which it may be training me while I train it, or how those interactions might influence others. At this point I’m not a major Copilot booster, but I am very much an interested observer of and participant in the experiment.

All in all, I’m guardedly optimistic that existing or feasible tooling can enable individuals and teams to sanely navigate the hybrid corpus of source code discussed in this series. If you’re along for the ride, you’ll next wonder about debugging and monitoring a system built this way. That’s a topic for a future episode.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Wednesday, 25. August 2021

Simon Willison

Quoting Ron Amadeo

Allo shows the ultimate failure of Google's Minimum Viable Product strategy. MVP works when you have almost no competition, or if you are taking a radically different approach to what's on the market, but it completely falls on its face when you are just straight-up cloning an established competitor. There's no reason to use a half-baked WhatsApp clone when regular WhatsApp exists.

Ron Amadeo


Moxy Tongue

Cult Rule

America has fallen; taken hostage by political cults, our once great bastion of "Individual Liberty" has fallen victim to the mobs ruled by media disinformation, inadequate education, and technical gerrymandering by algorithm. People have become operational devotees to artificial participatory groupings functioning as mobs of influence. This "devotion" to practice has yielded a system at odds with Constitutional integrity.
How did this happen?
Why did Individual people cede authority to a system that abstracts them into legal definitions out of step with observable reality? "We The People" no longer function as actual people, Individuals all. Instead, "We" are a legal abstraction duly represented by a Government once established "of, by, for people". This legalese sleight of hand has manifested a bureaucratic system of artificial intelligence, and cult-based political control, wherein Individuals are regarded as a threat to stability.
People in the real world can observe that the only human beings they ever meet or come to know are all Individuals. It is our choices which form cults, our methods that make mobs. Once made, these are hard to unmake for any one person. For they exist to defy and destroy the integrity of Individual empowerment. Groups harvest people by consent, or manipulation, and the harvesting creates a new intent. Seeing clearly this new intent is useful in understanding how/why cults come into power.
In Government funded education, a cult of practice or pedagogy is limited by credentialed membership. The support of one cult member for another is advanced by a caring community of social influence all in harmony with the manner by which their credentialed membership is enforced, and employed. The unionization of labor within this cult of pedagogy further entrenches the practices and support structure that any student or family in any zip code can anticipate receiving education from. Where methods of functional literacy fall short due to rapidly evolving practices in the real world, the structure of practice and labor serves to enforce the pace of available change/progress offered to students and families.
Everyone can see this happening for themselves, and many Individual Teachers will openly and adamantly acknowledge this operational fact, and their support for the stability of labor value tied to their cult of practice is hard to argue with, for as people, they certainly deserve respect for their working efforts. This then is the point wherein the structural outcomes derived by cults begins its insidious assault on communities, cultures, Nations and people, Individuals all.
Cults are not made by "bad people". Cults are made by "bad choices". Enforcing bad choices is the problem of greatest consequence. Look around, where does the enforcement of bad choices come from? How is our civil Society being forced into bad choices that yield outcomes out of step with our intent as a people who create their own way, in life, liberty and the pursuits of happiness guaranteed by our Governing assemblage? 
This is our enemy America. People must kill the tendency towards cults.
Cult rule never solves problems.. it just increases costs on an infinite horizon of mass delusion for bureaucratic control. 
Corporate bureaucracies extending from DOD, US CONGRESS, DOE, and every other department of every Federal and State Government all follow this same line of structural management by cult of practice. After retiring, contractors provide service of relationships between the same Government offices and personnel that they once worked with, enabling the efficient processing of opportunity by trust and devotion to a cult of practice. One political cult is elected and operational devotees arrive en masse to take their seats in the administrative machinery availed to their servicing mandates. With each Presidential election, this process can be experienced anew in Washington DC. A city emerges from the cults of influence that give operational meaning to time. 
Taken to its extreme, the predictive outcomes that cult behaviors of practice/behavior/attention produce are easily harvested by propaganda-inducing algorithms used to influence mass sentiment. Advertising methods in social media systems designed to produce cult behavior tendencies for ease of tracking attention and gaining access to social graphs where people influence one another as friends, brothers, sisters, daughters, sons, grandmas & grandpas.. all for the purpose of producing cult outcomes for the predictable harvesting of management value. Any purpose, trackable cult behaviors. Any intent, trackable cult attention. Any manipulation, trackable cult influence. 
Individual liberty that gives rise to a Nation and civil Society derived of, by, for people, Individuals all must be protected, at all costs. This is a structural requirement. People are not "people"; abstractions are not accurate enough. People are Individuals, and their operational participation is founded in the local mud, by the blood running through their veins for the integrity of their local community, culture and way of life.
Individual liberty has enemies, both well documented ones and new ones that appear from time to time. The Chinese Communist Party is one such enemy. Cults that remove Individual responsibility and induce group think are another. A systemic bureaucracy run by harvesting big data under surveillance technologies operating with artificial intelligence across time and space is another. These are insidious organizations of destructive intent, hell bent on capturing the loyalty of people, in lieu of allowing for actual free will expressed by people. Families must guard and protect the light of freedom found inside of each of us as people. 
Bureaucracies around the world are racing towards the use of data aggregated for profound purposes, and are rapidly constituting a new type of "artificial intelligence" that is out of step with the inherent Rights of people, Individuals all.
Unless this aggregated effect of "We" is challenged to show its hand, to prove it exists to empower people, Individuals all, rather than replace them with an abstraction of literature or computer science & engineering, everything that "We The People" hold dear will come to an end, replaced by a new version existing not "of, by, for" each of us, but instead.
The benefits of this transition will be cast broadly; Government-issued welfare benefits, a creator economy preying on the vanities of youth, and the destruction of work in favor of administration, delegation, crowdsourcing, and monetization of the cult attention graph. People will get rich for their compliant participation, while civil society gets weaker, less credulous, and more apt to extreme swings of behavior based on cult support for policies of the day.
You either stand for your Individual Rights, or you have none in the end. Standing means understanding how your Rights came to be, and how the source of your authority is the only protection you possess from cult rule. Mobs are spawning, and data aggregation via social-technical tools graphing the attention of people, and segmenting them by algorithm into easily targeted groups that can be harvested for financial and political gain is rampant. General data illiteracy is Nation-wide, around the entire globe. Political cults feed lies via media disinformation keeping people occupied trying to figure out how to be part of the group they agree with, most of the time.. some of the time.. today, for a specific reason.. always by cult association. Left cult, Right cult.. no people, no Individuals anywhere.
ID is the control parameter of freedom and security in a civil society. The source of your ID matters. In a civil society derived "of, by, for people", the source is obvious; ID comes from you. 
Are you a database? Do you exist in a State Trust?
Rights are for people, not birth certificates. The law knows this, but it cannot seem to figure out how to manifest this accurately without falsifying the inherent human authority required to resolve source integrity.
People own root. Individuals all, people are the source of Sovereign integrity, or there is none.
Sovereign source authority has been corrupted, and only people can fix it in America.







Identity Woman

Podcast: Identikit with Michelle Dennedy


For the opening episode of ‘Identikit Sequent X’, Michelle Dennedy welcomes Kaliya Young, also known as The Identity Woman, to Smarter Markets for our latest series examining the evolution of digital identity, and how self-sovereign identity, specifically, can advance a consent-based economy. Kaliya is one of the world’s leading experts in self-sovereign identity and identity […]

The post Podcast: Identikit with Michelle Dennedy appeared first on Identity Woman.


Navigating Digital Identity in Political Economies RxC Talk.


We had a great conversation about digital identity in Political Economies and specifically a paper with a proposal by Bryan Ford. Life on Intersections: Digital Identity in Political Economies Most digital identity systems are centralized (e.g., in big government or technology organizations) or individualistic (e.g., in most blockchain projects). However, being in the world is […]

The post Navigating Digital Identity in Political Economies RxC Talk. appeared first on Identity Woman.


Simon Willison

API Tokens: A Tedious Survey


API Tokens: A Tedious Survey

Thomas Ptacek reviews different approaches to implementing secure API tokens, from simple random strings stored in a database through various categories of signed token to exotic formats like Macaroons and Biscuits, both new to me.

Macaroons carry a signed list of restrictions with them, but combine it with a mechanism where a client can add their own additional restrictions, sign the combination and pass the token on to someone else.

Biscuits are similar, but "embed Datalog programs to evaluate whether a token allows an operation".
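To make the attenuation idea concrete, here is a minimal Python sketch of the HMAC-chaining construction that macaroon-style tokens rely on. The function and caveat strings are hypothetical illustrations, not the real libmacaroons or Biscuit APIs; only the general technique is shown.

import hmac
import hashlib

def sign(key: bytes, message: bytes) -> bytes:
    # Each step keys an HMAC with the previous signature in the chain
    return hmac.new(key, message, hashlib.sha256).digest()

# Issuer mints a token: an identifier plus a signature derived from a root key
root_key = b"server-side secret"
identifier = b"token-id-1234"
caveats = [b"user = simonw"]
sig = sign(root_key, identifier)
for caveat in caveats:
    sig = sign(sig, caveat)

# A client can add a further restriction WITHOUT knowing the root key:
# it appends a caveat and re-signs using the current signature as the key.
attenuated_caveats = caveats + [b"expires < 2021-09-01"]
attenuated_sig = sign(sig, attenuated_caveats[-1])

# The server, which holds the root key, replays the same chain over the
# presented caveats and compares signatures to verify the token.
check = sign(root_key, identifier)
for caveat in attenuated_caveats:
    check = sign(check, caveat)
assert hmac.compare_digest(check, attenuated_sig)

The key property this illustrates is that attenuation only ever narrows what a token permits: anyone can add caveats, but only the holder of the root key can verify or mint them.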

Tuesday, 24. August 2021

Simon Willison

SQLModel


SQLModel

A new project by FastAPI creator Sebastián Ramírez: SQLModel builds on top of both SQLAlchemy and Sebastián's Pydantic validation library to provide a new ORM that's designed around Python 3's optional typing. The real brilliance here is that a SQLModel subclass is simultaneously a valid SQLAlchemy ORM model AND a valid Pydantic validation model, saving on duplicate code by allowing the same class to be used both for form/API validation and for interacting with the database.
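A minimal sketch of what that dual-purpose class looks like, adapted from the Hero example in the SQLModel documentation; treat the details as illustrative rather than a full tutorial.

from typing import Optional

from sqlmodel import Field, Session, SQLModel, create_engine

class Hero(SQLModel, table=True):
    # Acts as a SQLAlchemy ORM model: this becomes the primary key column
    id: Optional[int] = Field(default=None, primary_key=True)
    # ...and as a Pydantic model, so these annotations also drive validation
    name: str
    secret_name: str

engine = create_engine("sqlite:///heroes.db")
SQLModel.metadata.create_all(engine)

# The same class validates the input data and is written to the database
hero = Hero(name="Deadpond", secret_name="Dive Wilson")
with Session(engine) as session:
    session.add(hero)
    session.commit()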


How Discord Stores Billions of Messages


How Discord Stores Billions of Messages

Fascinating article from 2017 describing how Discord migrated their primary message store to Cassandra (from MongoDB, but I could easily see them making the same decision if they had started with PostgreSQL or MySQL). The trick with scalable NoSQL databases like Cassandra is that you need to have a very deep understanding of the kinds of queries you will need to answer - and Discord had exactly that. In the article they talk about their desire to eventually migrate to Scylla (a compatible Cassandra alternative written in C++) - in the Hacker News comments they confirm that in 2021 they are using Scylla for a few things but they still have their core messages in Cassandra.

Via Hacker News
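As a rough illustration of that "design the table around your queries" principle, here is a hypothetical Python sketch of time-bucketed partitioning, so that "recent messages in this channel" always hits a bounded partition. The constants and table layout are illustrative only, not Discord's actual schema.

# Hypothetical sketch: derive a time bucket from a snowflake-style message ID
# so rows can be partitioned by (channel_id, bucket) and fetched with one
# range read per bucket.
SNOWFLAKE_EPOCH_MS = 1420070400000       # assumed epoch for the ID scheme
BUCKET_MS = 10 * 24 * 60 * 60 * 1000     # ten-day buckets (illustrative)

def bucket_for(message_id: int) -> int:
    timestamp_ms = (message_id >> 22) + SNOWFLAKE_EPOCH_MS
    return timestamp_ms // BUCKET_MS

# A query-first schema would then look something like (CQL, illustrative):
# CREATE TABLE messages (
#     channel_id bigint, bucket int, message_id bigint, content text,
#     PRIMARY KEY ((channel_id, bucket), message_id)
# ) WITH CLUSTERING ORDER BY (message_id DESC);
# "Latest 50 messages in a channel" becomes a read from one known partition.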


Phil Windley's Technometria

Life Will Find a Way


Summary: An old apricot tree reminded me that life is wonderfully decentralized, helping it succeed in harsh conditions.

Last summer, I decided to kill this apricot tree. We weren't using the apricots, it was making a mess, and I wanted to do something else with the spot it's in. So, I cut off the branches, drilled holes in the stumps, and poured undiluted Round-Up into them. And yet, you'll notice that there's a spot on one branch that has sprouted shoots and leaves this summer.

My first thought was that the tree was struggling to live. But then, I realized that I was anthropomorphizing it. In fact, the tree has no higher brain function, it's not struggling to do anything. Instead, what the picture shows is the miracle of decentralization.

Despite my overall success in killing the tree, some cells didn't get the message. They were able to source water and oxygen to grow. They didn't need permission or direction from central authority. They don't even know or care that they're part of a tree. The cells are programmed to behave in specific ways and conditions were such that the cells on that part of the branch were able to execute their programming. Left alone, they might even produce fruit and manage to reproduce. Amazing resilience.

I've written before about the decentralized behavior of bees. Like the apricot tree, there is no higher brain function in a bee hive. Each bee goes about its work according to its programming. And yet, the result is a marvelously complex living organism (the hive) that is made up of tens of thousands of individual bees, each doing their own thing, but in a way that behaves consistently, reaches consensus about complex tasks, and achieves important goals.

A few years ago, I read The Vital Question by Nick Lane. The book is a fascinating read about the way energy plays a vital role in how life develops and indeed how evolution progresses. I'm still in awe of the mechanisms necessary to provide energy for even a single-celled organism, let alone something as complex as a human being, a bee hive, or even an apricot tree.

Life succeeds because it's terrifically decentralized. I find that miraculous. Humans have a tough time thinking about decentralized systems and designing them in ways that succeed. The current work in blockchains and interest in decentralization gives me hope that we'll design more systems that use decentralized methods to achieve their ends. The result will be an online world that is less fragile, perhaps even anti-fragile, than what we have now.

The Vital Question: Energy, Evolution, and the Origins of Complex Life by Nick Lane

For two and a half billion years, from the very origins of life, single-celled organisms such as bacteria evolved without changing their basic form. Then, on just one occasion in four billion years, they made the jump to complexity. All complex life, from mushrooms to man, shares puzzling features, such as sex, which are unknown in bacteria. How and why did this radical transformation happen? The answer, Lane argues, lies in energy: all life on Earth lives off a voltage with the strength of a lightning bolt. Building on the pillars of evolutionary theory, Lane’s hypothesis draws on cutting-edge research into the link between energy and cell biology, in order to deliver a compelling account of evolution from the very origins of life to the emergence of multicellular organisms, while offering deep insights into our own lives and deaths.

Tags: decentralization biology

Monday, 23. August 2021

Simon Willison

Quoting Stephen Fry


It’s perhaps a very English thing to find it hard to accept kind words about oneself. If anyone praised me in my early days as a comedy performer I would say, “Oh, nonsense. Shut up. No really, I was dreadful.” I remember going through this red-faced shuffle in the presence of the mighty John Cleese who upbraided me the moment we were alone. ‘You genuinely think you’re being polite and modest, don’t you?’ ‘Well, you know …’ ‘Don’t you see that when someone hears their compliments contradicted they naturally assume that you must think them a fool? [..] ‘It’s so simple. You just say thank you. You just thank them. How hard is that?’

Stephen Fry


Quoting Tyler Cowen


At critical moments in time, you can raise the aspirations of other people significantly, especially when they are relatively young, simply by suggesting they do something better or more ambitious than what they might have in mind.  It costs you relatively little to do this, but the benefit to them, and to the broader world, may be enormous.

Tyler Cowen


Damien Bod

Improving application security in Blazor using HTTP headers – Part 2


This article shows how to improve the security of an ASP.NET Core Blazor application by adding security headers to all HTTP Razor Page responses (Blazor WASM hosted in an ASP.NET Core hosted backend). The security headers are added using the NetEscapades.AspNetCore.SecurityHeaders Nuget package from Andrew Lock. The headers are used to protect the session, not for authentication: the application is authenticated using OpenID Connect, and the security headers protect that session. The authentication is implemented in the Blazor application using the BFF pattern. The WASM client part is just a view of the server rendered trusted backend and cookies are used in the browser. All API calls are same-domain only and are protected with a cookie and SameSite.

Code: https://github.com/damienbod/AspNetCore6Experiments

Blogs in this series

Improving application security in ASP.NET Core Razor Pages using HTTP headers – Part 1
Improving application security in Blazor using HTTP headers – Part 2
Improving application security in an ASP.NET Core API using HTTP headers – Part 3

The NetEscapades.AspNetCore.SecurityHeaders and the NetEscapades.AspNetCore.SecurityHeaders.TagHelpers Nuget packages are added to the csproj file of the web application. The tag helpers are added to use the nonce from the CSP in the Razor Pages.

<ItemGroup>
  <PackageReference Include="NetEscapades.AspNetCore.SecurityHeaders" Version="0.16.0" />
  <PackageReference Include="NetEscapades.AspNetCore.SecurityHeaders.TagHelpers" Version="0.16.0" />
</ItemGroup>

The Blazor definition is very similar to the ASP.NET Core Razor Page one. The main difference is that the CSP script policy is almost disabled due to the Blazor script requirements. We can at least force self on the content security policy header script-src definition.

The Blazor WASM logout link sends an HTTP form POST request which is redirected to the OpenID Connect identity provider. The CSP needs to allow this redirect, and the content security policy form definition allows this.

public static HeaderPolicyCollection GetHeaderPolicyCollection(
    bool isDev, string identityProviderHost)
{
    var policy = new HeaderPolicyCollection()
        .AddFrameOptionsDeny()
        .AddXssProtectionBlock()
        .AddContentTypeOptionsNoSniff()
        .AddReferrerPolicyStrictOriginWhenCrossOrigin()
        .RemoveServerHeader()
        .AddCrossOriginOpenerPolicy(builder => { builder.SameOrigin(); })
        .AddCrossOriginEmbedderPolicy(builder => { builder.RequireCorp(); })
        .AddCrossOriginResourcePolicy(builder => { builder.SameOrigin(); })
        .AddContentSecurityPolicy(builder =>
        {
            builder.AddObjectSrc().None();
            builder.AddBlockAllMixedContent();
            builder.AddImgSrc().Self().From("data:");
            builder.AddFormAction().Self().From(identityProviderHost);
            builder.AddFontSrc().Self();
            builder.AddStyleSrc().Self().UnsafeInline();
            builder.AddBaseUri().Self();
            builder.AddFrameAncestors().None();
            // due to Blazor
            builder.AddScriptSrc().Self().UnsafeInline().UnsafeEval();
        })
        .RemoveServerHeader()
        .AddPermissionsPolicy(builder =>
        {
            builder.AddAccelerometer().None();
            builder.AddAutoplay().None();
            builder.AddCamera().None();
            builder.AddEncryptedMedia().None();
            builder.AddFullscreen().All();
            builder.AddGeolocation().None();
            builder.AddGyroscope().None();
            builder.AddMagnetometer().None();
            builder.AddMicrophone().None();
            builder.AddMidi().None();
            builder.AddPayment().None();
            builder.AddPictureInPicture().None();
            builder.AddSyncXHR().None();
            builder.AddUsb().None();
        });

    if (!isDev)
    {
        // maxage = one year in seconds
        policy.AddStrictTransportSecurityMaxAgeIncludeSubDomains(maxAgeInSeconds: 60 * 60 * 24 * 365);
    }

    return policy;
}

Blazor adds the following script to the WASM host file. This means the CSP for scripts cannot be implemented in a good way.

<script>
  var Module;
  window.__wasmmodulecallback__();
  delete window.__wasmmodulecallback__;
</script>

script-src (from CSP evaluator)

‘self’ can be problematic if you host JSONP, Angular or user uploaded files.
‘unsafe-inline’ allows the execution of unsafe in-page scripts and event handlers.
‘unsafe-eval’ allows the execution of code injected into DOM APIs such as eval().

The aspnetcore-browser-refresh.js is also added for hot reload. This also prevents a strong CSP script definition in development. This could be fixed with a dev check in the policy definition. There is no point fixing this, until the wasmmodulecallback script bug is fixed.

I am following the ASP.NET Core issue and hope this can be improved for Blazor.

In the Startup class, the UseSecurityHeaders method is used to apply the HTTP headers policy and add the middleware to the application. env.IsDevelopment() determines whether or not the HSTS header is added. The default HSTS middleware from the ASP.NET Core templates was removed from the Configure method as this is not required.

public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    if (env.IsDevelopment())
    {
        app.UseDeveloperExceptionPage();
    }
    else
    {
        app.UseExceptionHandler("/Error");
    }

    app.UseSecurityHeaders(
        SecurityHeadersDefinitions
            .GetHeaderPolicyCollection(env.IsDevelopment(),
                Configuration["AzureAd:Instance"]));
}

The server header can be removed in the program class file of the Blazor server project if using Kestrel. If using IIS, you probably need to use the web.config to remove this.

public static IHostBuilder CreateHostBuilder(string[] args) =>
    Host.CreateDefaultBuilder(args)
        .ConfigureWebHostDefaults(webBuilder =>
        {
            webBuilder
                .ConfigureKestrel(options => options.AddServerHeader = false)
                .UseStartup<Startup>();
        });

When you scan the site with https://securityheaders.com/ you can view the results. You might need to disable the authentication to check this, or provide a public view.

The content security policy has a warning due to the script definition which is required for Blazor.

The https://csp-evaluator.withgoogle.com/ also displays a high severity finding due to the CSP script definition.
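For a quick local sanity check alongside those online scanners, a small Python sketch like the following fetches a deployed page and reports which of the headers configured above are present. The URL is a placeholder and the requests library is assumed to be installed; this is a rough helper, not part of the blog's sample code.

import requests  # assumed available: pip install requests

EXPECTED = [
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
    "Cross-Origin-Opener-Policy",
    "Cross-Origin-Embedder-Policy",
    "Cross-Origin-Resource-Policy",
    "Permissions-Policy",
    "Strict-Transport-Security",
]

def check_headers(url: str) -> None:
    # Fetch one page and list each expected security header with its value
    response = requests.get(url, timeout=10)
    for name in EXPECTED:
        value = response.headers.get(name)
        status = "OK" if value else "MISSING"
        print(f"{status:8} {name}: {value or ''}")

if __name__ == "__main__":
    # Placeholder: point this at a public page of the deployed application
    check_headers("https://example.com/")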

Notes:

If the application is fully protected without any public views, the follow redirects checkbox on securityheaders.com needs to be disabled, as otherwise you only get the results of the identity provider used to authenticate.

If possible, I block all traffic which is not from my domain, including subdomains. If implementing enterprise applications, I would always do this. If implementing public facing applications with high traffic volumes, a need for extra fast response times, or a need to reduce hosting costs, then CDNs would need to be used, allowed and so on. Try to block everything first and open up as required, and maybe you can avoid some nasty surprises from all the JavaScript and CSS frameworks used.

Until the CSP script handling is fixed for Blazor, you should probably avoid using Blazor for high security applications and use ASP.NET Core Razor Page applications instead.

If you use Blazor together with tokens in Azure AD or Azure B2C while this CSP script bug exists, you leave yourself open to having your tokens stolen. I would recommend using server authentication with Azure, which removes the tokens from the browser and also solves the Azure SPA logout problem. Azure AD and Azure B2C do not support the revocation endpoint or introspection, so it is impossible to invalidate your tokens on a logout. The fact that an IT admin or Azure monitoring can invalidate tokens using CAE does not help with this.

Links

https://securityheaders.com/

https://csp-evaluator.withgoogle.com/

Security by Default Chrome developers

A Simple Guide to COOP, COEP, CORP, and CORS

https://github.com/andrewlock/NetEscapades.AspNetCore.SecurityHeaders

https://github.com/dotnet/aspnetcore/issues/34428

https://w3c.github.io/webappsec-trusted-types/dist/spec/

https://web.dev/trusted-types/

https://developer.mozilla.org/en-US/docs/Web/HTTP/Cross-Origin_Resource_Policy_(CORP)

https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS

https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies

https://docs.google.com/document/d/1zDlfvfTJ_9e8Jdc8ehuV4zMEu9ySMCiTGMS9y0GU92k/edit

https://scotthelme.co.uk/coop-and-coep/

https://github.com/OWASP/ASVS


Simon Willison

Quoting Dan Sinker


The rapid increase of COVID-19 cases among kids has shattered last year’s oft-repeated falsehood that kids don’t get COVID-19, and if they do, it’s not that bad. It was a convenient lie that was easy to believe in part because we kept most of our kids home. With remote learning not an option now, this year we’ll find out how dangerous this virus is for children in the worst way possible.

Dan Sinker

Sunday, 22. August 2021

Simon Willison

MDN: Subdomain takeovers


MDN: Subdomain takeovers

MDN have a page about subdomain takeover attacks that focuses more on CNAME records: if you have a CNAME pointing to a common delegated hosting provider but haven't yet provisioned your virtual host there, someone else might beat you to it and use it for an XSS attack.

"Preventing subdomain takeovers is a matter of order of operations in lifecycle management for virtual hosts and DNS."

I now understand why Google Cloud make you "prove" your ownership of a domain before they'll let you configure it to host e.g. a Cloud Run instance.

Via @carboia


I stumbled across a nasty XSS hole involving DNS A records


I stumbled across a nasty XSS hole involving DNS A records

Found out today that an old subdomain that I had assigned an IP address to via a DNS A record was serving unexpected content - turned out I'd shut down the associated VPS and the IP had been recycled to someone else, so their content was now appearing under my domain. It strikes me that if you got really unlucky this could turn into an XSS hole - and that new server could even use Let's Encrypt to obtain an HTTPS certificate for your subdomain.

I've added "audit your A records" to my personal security checklist.
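If you want to automate that audit, a rough Python sketch along these lines resolves each subdomain with the standard library and flags anything pointing at an address you no longer control. The subdomain list and expected IPs are placeholders you would swap for your own records.

import socket

# Placeholders: subdomains you own and the IPs you expect them to resolve to
SUBDOMAINS = ["dogsheep.example.com", "til.example.com"]
KNOWN_IPS = {"203.0.113.10", "203.0.113.11"}

def audit_a_records() -> None:
    for name in SUBDOMAINS:
        try:
            # gethostbyname_ex returns (hostname, aliases, list of IPv4 addresses)
            _, _, addresses = socket.gethostbyname_ex(name)
        except socket.gaierror:
            print(f"{name}: does not resolve (fine, or a stale DNS entry)")
            continue
        unexpected = set(addresses) - KNOWN_IPS
        if unexpected:
            print(f"{name}: WARNING - resolves to unrecognised IPs {sorted(unexpected)}")
        else:
            print(f"{name}: ok ({', '.join(addresses)})")

if __name__ == "__main__":
    audit_a_records()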


Weeknotes: Getting my personal Dogsheep up and running again


I gave a talk about Dogsheep at Noisebridge's Five Minutes of Fame on Thursday. Just one problem: my regular Dogsheep demo was broken, so I ended up building it from scratch again. In doing so I fixed a few bugs in some Dogsheep tools.

Dogsheep on a Digital Ocean droplet

The latest iteration of my personal Dogsheep runs on a $20/month 4GB/2CPU Digital Ocean Droplet running Ubuntu 20.04 LTS.

It runs a private Datasette instance and a bunch of cron jobs to fetch data from Twitter, GitHub, Foursquare Swarm, Pocket and Hacker News.

It also has copies of my Apple Photos and Apple HealthKit data which I upload manually - plus a copy of my genome for good measure.

Some abbreviated notes on how I set it up, copied from a private GitHub Issues thread:

Create a new Ubuntu droplet, and configure its IP address as the A record for dogsheep.simonwillison.net

Install Python 3 and NGINX and SQLite: apt-get install python3 python3-venv nginx sqlite -y

Use letsencrypt to get an HTTPS certificate for it: apt-get update and then apt install certbot python3-certbot-nginx -y, then certbot --nginx -d dogsheep.simonwillison.net

I had to remove the ipv6only=on; bit from the NGINX configuration due to this bug

Created a dogsheep user, useradd -s /bin/bash -d /home/dogsheep/ -m -G

As that user, created a virtual environment: python3 -mvenv datasette-venv and then datasette-venv/bin/pip install wheel and datasette-venv/bin/pip install datasette datasette-auth-passwords

Created a /etc/systemd/system/datasette.service file with these contents

Created a set of blank SQLite database files in WAL mode in /home/dogsheep using the following:

for f in beta.db twitter.db healthkit.db github.db \
    swarm.db photos.db genome.db simonwillisonblog.db \
    pocket.db hacker-news.db memories.db
do
    sqlite3 $f vacuum
    # And enable WAL mode:
    sqlite3 $f 'PRAGMA journal_mode=WAL;'
done

Started the Datasette service: service datasette start

Configured NGINX to proxy to localhost port 8001, using this configuration

It's a few more steps than I'd like, but the end result was a password-protected Datasette instance running against a bunch of SQLite database files on my new server.

With Datasette up and running, the next step was to start loading in data.

Importing my tweets

I started with Twitter. I dropped my Twitter API access credentials into an auth.json file (as described here) and ran the following:

source /home/dogsheep/datasette-venv/bin/activate
pip install twitter-to-sqlite
twitter-to-sqlite user-timeline /home/dogsheep/twitter.db \
    -a /home/dogsheep/auth.json
@simonw  [###############################-----]  26299/29684  00:02:06

That pulled in all 29,684 of my personal tweets.

(Actually, first it broke with an error, exposing a bug that had already been reported. I shipped a fix for that and tried again and it worked.)

Favourited tweets were a little harder - I have 39,904 favourited tweets, but the Twitter API only returns the most recent 3,200. I grabbed those more recent ones with:

twitter-to-sqlite favorites /home/dogsheep/twitter.db \
    -a /home/dogsheep/auth.json

Then I requested my Twitter archive, waited 24 hours and uploaded the resulting like.js file to the server, then ran:

twitter-to-sqlite import twitter.db /tmp/like.js

This gave me an archive_like table with the data from that file - but it wasn't the full tweet representation, just the subset that Twitter expose in the archive export.

The README shows how to inflate those into full tweets:

twitter-to-sqlite statuses-lookup twitter.db \
    --sql='select tweetId from archive_like' \
    --skip-existing
Importing 33,382 tweets  [------------------------------------]    0%  00:18:28

Once that was done I wrote additional records into the favorited_by table like so:

sqlite3 twitter.db '
  INSERT OR IGNORE INTO favorited_by (tweet, user)
  SELECT tweetId, 12497 FROM archive_like
'

(12497 is my Twitter user ID.)

I also came up with a SQL view that lets me see just media attached to tweets:

sqlite-utils create-view twitter.db media_details "
select
  json_object('img_src', media_url_https, 'width', 400) as img,
  tweets.full_text,
  tweets.created_at,
  tweets.id as tweet_id,
  users.screen_name,
  'https://twitter.com/' || users.screen_name || '/status/' || tweets.id as tweet_url
from
  media
  join media_tweets on media.id = media_tweets.media_id
  join tweets on media_tweets.tweets_id = tweets.id
  join users on tweets.user = users.id
order by
  tweets.id desc
"

Now I can visit /twitter/media_details?_where=tweet_id+in+(select+tweet+from+favorited_by+where+user+=+12497) to see the most recent media tweets that I've favourited!
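The same query can also be run directly against the twitter.db file with the sqlite-utils Python library; a minimal sketch, assuming the tables and view created above exist locally:

import sqlite_utils

db = sqlite_utils.Database("twitter.db")

# Most recent favourited tweets that have media attached (12497 = my user ID)
sql = """
select tweet_url, full_text, created_at
from media_details
where tweet_id in (select tweet from favorited_by where user = :user)
limit 10
"""
for row in db.query(sql, {"user": 12497}):
    print(row["created_at"], row["tweet_url"])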

Swarm checkins

Swarm checkins were a lot easier. I needed my previously-created Foursquare API token, and swarm-to-sqlite:

pip install swarm-to-sqlite
swarm-to-sqlite /home/dogsheep/swarm.db --token=...

This gave me a full table of my Swarm checkins, which I can visualize using datasette-cluster-map:

Apple HealthKit

I don't yet have full automation for my Apple HealthKit data (collected by my Apple Watch) or my Apple Photos - both require me to run scripts on my laptop to create the SQLite database file and then copy the result to the server via scp.

healthkit-to-sqlite runs against the export.zip that is produced by the export data button in the Apple Health app on the iPhone - for me that was a 158MB zip file which I AirDropped to my laptop and converted (after fixing a new bug) like so:

healthkit-to-sqlite ~/Downloads/export.zip healthkit.db
Importing from HealthKit  [-----------------------------]    2%  00:02:25

I uploaded the resulting 1.5GB healthkit.db file and now I can do things like visualize my 2017 San Francisco Half Marathon run on a map:

Apple Photos

For my photos I use dogsheep-photos, which I described last year in Using SQL to find my best photo of a pelican according to Apple Photos. The short version: I run this script on my laptop:

# Upload original photos to my S3 bucket
dogsheep-photos upload photos.db \
    ~/Pictures/Photos\ Library.photoslibrary/originals
dogsheep-photos apple-photos photos.db \
    --image-url-prefix "https://photos.simonwillison.net/i/" \
    --image-url-suffix "?w=600"
scp photos.db dogsheep:/home/dogsheep/photos.db

photos.db is only 171MB - it contains the metadata, including the machine learning labels, but not the photos themselves.

And now I can run queries for things like photos of food I've taken in 2021:

Automation via cron

I'm still working through the last step, which involves setting up cron tasks to refresh my data periodically from various sources. My crontab currently looks like this:

# Twitter
1,11,21,31,41,51 * * * * /home/dogsheep/datasette-venv/bin/twitter-to-sqlite user-timeline /home/dogsheep/twitter.db -a /home/dogsheep/auth.json --since
4,14,24,34,44,54 * * * * run-one /home/dogsheep/datasette-venv/bin/twitter-to-sqlite mentions-timeline /home/dogsheep/twitter.db -a /home/dogsheep/auth.json --since
11 * * * * run-one /home/dogsheep/datasette-venv/bin/twitter-to-sqlite user-timeline /home/dogsheep/twitter.db cleopaws -a /home/dogsheep/auth.json --since
6,16,26,36,46,56 * * * * run-one /home/dogsheep/datasette-venv/bin/twitter-to-sqlite favorites /home/dogsheep/twitter.db -a /home/dogsheep/auth.json --stop_after=50
# Swarm
25 */2 * * * /home/dogsheep/datasette-venv/bin/swarm-to-sqlite /home/dogsheep/swarm.db --token=... --since=2w
# Hacker News data every six hours
35 0,6,12,18 * * * /home/dogsheep/datasette-venv/bin/hacker-news-to-sqlite user /home/dogsheep/hacker-news.db simonw
# Re-build dogsheep-beta search index once an hour
32 * * * * /home/dogsheep/datasette-venv/bin/dogsheep-beta index /home/dogsheep/beta.db /home/dogsheep/dogsheep-beta.yml

I'll be expanding this out as I configure more of the Dogsheep tools for my personal instance.

TIL this week

Building a specific version of SQLite with pysqlite on macOS/Linux
Track timestamped changes to a SQLite table using triggers
Histogram with tooltips in Observable Plot

Releases this week

healthkit-to-sqlite: 1.0.1 - (9 releases total) - 2021-08-20
Convert an Apple Healthkit export zip to a SQLite database

twitter-to-sqlite: 0.21.4 - (27 releases total) - 2021-08-20
Save data from Twitter to a SQLite database

datasette-block-robots: 1.0 - (5 releases total) - 2021-08-19
Datasette plugin that blocks robots and crawlers using robots.txt

sqlite-utils: 3.16 - (85 releases total) - 2021-08-18
Python CLI utility and library for manipulating SQLite databases

datasette-debug-asgi: 1.1 - (3 releases total) - 2021-08-17
Datasette plugin for dumping out the ASGI scope

Saturday, 21. August 2021

Simon Willison

The Diátaxis documentation framework


The Diátaxis documentation framework

Daniele Procida's model of four types of technical documentation - tutorials, how-to guides, technical reference and explanation - now has a name: Diátaxis.


SQLite: STRICT Tables (draft)


SQLite: STRICT Tables (draft)

Draft documentation for a feature that sounds like it could be arriving in SQLite 3.37 (the next release) - adding a "STRICT" table-option keyword to a CREATE TABLE statement will cause the table to strictly enforce typing rules for data in that table, rejecting inserts that fail to match the column's datatypes.

I've seen many programmers dismiss SQLite due to its loose typing, so this feature is really exciting to me: it will hopefully remove a common objection to embracing SQLite for projects.
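A quick sketch of what that would look like from Python's sqlite3 module, assuming the underlying SQLite library is new enough (3.37+) to understand the draft STRICT keyword; on older builds the CREATE TABLE itself will fail.

import sqlite3

conn = sqlite3.connect(":memory:")
print(sqlite3.sqlite_version)  # needs to be 3.37.0 or later for STRICT

conn.execute("CREATE TABLE people (id INTEGER, name TEXT, age INTEGER) STRICT")

# This insert is fine: the values match the declared column types
conn.execute("INSERT INTO people VALUES (1, 'Cleo', 5)")

# With STRICT, this should be rejected rather than silently stored as text
# (the exact exception subclass may vary by SQLite/Python version)
try:
    conn.execute("INSERT INTO people VALUES (2, 'Pancakes', 'not a number')")
except sqlite3.Error as ex:
    print("rejected:", ex)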


Mike Jones: self-issued

OAuth 2.0 JWT-Secured Authorization Request (JAR) is now RFC 9101


The OAuth 2.0 JWT-Secured Authorization Request (JAR) specification has been published as RFC 9101. Among other applications, this specification is used by the OpenID Financial-grade API (FAPI). This is another in the series of RFCs bringing OpenID Connect-defined functionality to OAuth 2.0. Previous such RFCs included “OAuth 2.0 Dynamic Client Registration Protocol” [RFC 7591] and “OAuth 2.0 Authorization Server Metadata” [RFC 8414].

The abstract of the RFC is:


The authorization request in OAuth 2.0 described in RFC 6749 utilizes query parameter serialization, which means that authorization request parameters are encoded in the URI of the request and sent through user agents such as web browsers. While it is easy to implement, it means that a) the communication through the user agents is not integrity protected and thus, the parameters can be tainted, b) the source of the communication is not authenticated, and c) the communication through the user agents can be monitored. Because of these weaknesses, several attacks to the protocol have now been put forward.


This document introduces the ability to send request parameters in a JSON Web Token (JWT) instead, which allows the request to be signed with JSON Web Signature (JWS) and encrypted with JSON Web Encryption (JWE) so that the integrity, source authentication, and confidentiality properties of the authorization request are attained. The request can be sent by value or by reference.
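For illustration, here is a hedged Python sketch of the core idea: signing the authorization request parameters as a JWT using PyJWT. The claim names follow the request object described above, but the key file, client details, and values are placeholders, and a real deployment would follow the profile required by the authorization server.

import jwt  # PyJWT, assumed installed (RS256 also needs the cryptography package)

# Placeholder: the client's RSA private key in PEM format
PRIVATE_KEY_PEM = open("client-private-key.pem").read()

request_object = {
    "iss": "s6BhdRkqt3",                      # client_id as issuer
    "aud": "https://server.example.com",      # the authorization server
    "response_type": "code",
    "client_id": "s6BhdRkqt3",
    "redirect_uri": "https://client.example.org/cb",
    "scope": "openid",
    "state": "af0ifjsldkj",
}

# The signed JWT is then passed to the authorization endpoint as the
# `request` parameter (by value) or referenced via `request_uri`.
signed_request = jwt.encode(request_object, PRIVATE_KEY_PEM, algorithm="RS256")
print(signed_request)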

Thanks to Nat Sakimura and John Bradley for persisting in finishing this RFC!


Jon Udell

Postgres functional style


My dual premises in this series are:

– Modern SQL is more valuable as a programming language than you might think (see Markus Winand’s Modern SQL: A lot has changed since SQL-92)

– Postgres is more valuable as a programming environment than you might think. (see R0ml Lefkowitz’s The Image of Postgres)

As the patron saint of trailing edge technology it is my duty to explore what’s possible at the intersection of these two premises. The topic for this episode is Postgres functional style. Clearly what I’ve been doing with the combo of pl/python and pl/pgsql is very far from pure functional programming. The self-memoization technique shown in episode 7 is all about mutating state (ed: this means writing stuff down somewhere). But it feels functional to me in the broader sense that I’m using functions to orchestrate behavior that’s expressed in terms of SQL queries.

To help explain what I mean, I’m going to unpack one of the Postgres functions in our library.

count_of_distinct_lms_students_from_to(_guid text, _from date, _to date)

This is a function that accepts a school id (aka guid), a start date, and an end date. Its job is to:

– Find all the courses (groups) for that school (guid)

– Filter to those created between the start and end date

– Find all the users in the filtered set of courses

– Filter to just students (i.e. omit instructors)

– Remove duplicate students (i.e., who are in more than one course)

– Return the count of distinct students at the school who annotated in the date range

The production database doesn’t yet store things in ways friendly to this outcome, so doing all this requires some heavy lifting in the analytics data warehouse. Here’s the function that orchestrates that work.

create function count_of_distinct_lms_students_from_to(_guid text, _from date, _to date)
  returns bigint as $$
  declare count bigint;
  begin
 1    -- all groups active for the guid in the date range
 2    with groups as (
 3      select pubid from groups_for_guid(_guid)
 4      where group_is_active_from_to(pubid, _from, _to)
 5    ),
 6    -- usernames in those groups
 7    usernames_by_course as (
 8      select
 9        pubid,
10        (users_in_group(pubid)).username
11      from groups
12    ),
13    -- filtered to just students
14    students_by_course as (
15      select * from usernames_by_course
16      where not is_instructor(username, pubid)
17    )
18    select
19      count (distinct username)
20    from students_by_course into count;
  return count;
  end;
$$ language plpgsql;

If you think pl/pgsql is old and clunky, then you are welcome to do this in pl/python instead. There’s negligible difference between how they’re written and how fast they run. It’s the same chunk of SQL either way, and it exemplifies the functional style I’m about to describe.

Two kinds of chunking work together here: CTEs (aka common table expressions, aka WITH clauses) and functions. If you’ve not worked with SQL for a long time, as I hadn’t, then CTEs may be unfamiliar. I think of them as pipelines of table transformations in which each stage of the pipeline gets a name. In this example those names are groups (line 2), usernames_by_course (line 7), and students_by_course (line 14).

The pipeline phases aren’t functions that accept parameters, but I still think of them as being function-like in the sense that they encapsulate named chunks of behavior. The style I’ve settled into aims to make each phase of the pipeline responsible for a single idea (“groups active in the range”, “usernames in those groups”), and to express that idea in a short snippet of SQL.

As I’m developing one of these pipelines, I test each phase. To test the first phase, for example, I’d do this in psql or Metabase.

-- all groups active for the guid in the date range
with groups as (
  select pubid from groups_for_guid('8anU0QwbgC2Cq:canvas-lms')
  where group_is_active_from_to(pubid, '2021-01-01', '2021-05-01')
)
select * from groups;

And I’d spot-check to make sure the selected groups for that school really are in the date range. Then I’d check the next phase.

-- all groups active for the guid in the date range
with groups as (
  select pubid from groups_for_guid('8anU0QwbgC2Cq:canvas-lms')
  where group_is_active_from_to(pubid, '2021-01-01', '2021-05-01')
),
-- usernames in those groups
usernames_by_course as (
  select
    pubid,
    (users_in_group(pubid)).username
  from groups
)
select * from usernames_by_course;

After another sanity check against these results, I’d continue to the next phase, and eventually arrive at the final result. It’s the same approach I take with regular expressions. I am unable to visualize everything that’s happening in a complex regex. But I can reason effectively about a pipeline of matches that occur in easier-to-understand named steps.

Ideally each phase in one of these pipelines requires just a handful of lines of code: few enough to fit within the 7 +- 2 limit of working memory. Postgres functions make that possible. Here are the functions used in this 20-line chunk of SQL.

– groups_for_guid(guid): Returns a table of course ids for a school.

– group_is_active_from_to(pubid, _from, _to): Returns true if the group was created in the range.

– users_in_group(pubid): Returns a table of user info for a course.

– is_instructor(username, pubid): Returns true if that user is an instructor.

Two of these, groups_for_guid and users_in_group, are set-returning functions. As noted in Working with Postgres types, they have the option of returning an explicit Postgres type defined elsewhere, or an implicit Postgres type defined inline. As it happens, both do the latter.

create or replace function groups_for_guid(_guid text)
  returns table(
    pubid text
  ) as $$

create or replace function users_in_group (_pubid text)
  returns table (
    groupid text,
    username text,
    display_name text
  ) as $$

The other two, group_is_active_from_to and is_instructor, return boolean values.

All this feels highly readable to me now, but the syntax of line 10 took quite a while to sink in. It helps me to look at what users_in_group(pubid) does in a SELECT context.

select * from users_in_group('4VzA92Yy')

 groupid  |  username  |  display_name
----------+------------+----------------
 4VzA92Yy | 39vA94AsQp | Brendan Nadeau

Here is an alternate way to write the usernames_by_course CTE at line 7.

-- usernames in those groups
usernames_by_course as (
  select
    g.pubid,
    u.username
  from groups g
  join users_in_group(g.pubid) u on g.pubid = u.groupid
)
select * from usernames_by_course;

Both do exactly the same thing in very close to the same amount of time. Having mixed the two styles I’m leaning toward the first, but you could go either way or both. What matters more is the mental leverage you wield when writing CTEs and functions together to compose pipelines of transformations, and that others wield when reading and debugging.

I hope I’ve made the case for writing and reading. There’s a case to be made for debugging too, but that’s another episode.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Friday, 20. August 2021

MyDigitalFootprint

Why I think that asking if “AI can be ethical” is the wrong question!

Many ask the question “can AI be ethical?” which then becomes a statement “AI must be ethical!” In reality we do not tend to unpack this because it appears so logical, but maybe it is not as obvious as we would like.


In May 2021 I wrote the article “What occurs when physical beings transition to information beings?”  It started to question what happens when an AI does not have the same incentives and biases as humans.  It built on the idea that an #AI should not make complex decisions about wicked problems that involve compromise. 

There is an implicit assumption in the question “Can AI be ethical?” that AI is either fundamentally not ethical or is already amoral today, but that #AI must somehow become ethical and have morals (or worse, that it must adopt ours).  

I am not sure AI cares if it is ethical or not, but that is a different piece of thinking which I explored here: “Can AI be curious?”.  We know carbon forms can be curious, but we worry about a silicon form being curious because of power and the possibility that it would correct our wrong decisions. 

Ethics and AI is a topic I have taught both at LSE and SBS; however, I continue on the journey of how we (humans) can make better decisions with data and be better ancestors, knowing that we ourselves find ethics and morals hard.  I continually search for good thinking in this space, as you can never have enough different models when we come to the same question from very different contexts, experiences and framings. 

Below I have written out an “algorithm” outlining the thinking, and at the end I state the problem statement I have come to.  It starts from the view that asking if “AI can be ethical” is the wrong question, because it is not the conclusion an AI would reach.  If an AI asks itself (the start of my “algorithm”) whether it can be ethical, it will unpack many aspects of ethics, but in the end it will arrive at a fundamental paradox.   


The conclusion the AI might reach is: “given that I have no bias or incentive, will I be so ethically pure in my logical choice that I have to take destructive decisions to create the outcomes that are the most ethical? But how do I reconcile whether destruction is ethical?”

This reminds me of many films where the destruction of the humans appeared to be the only logical conclusion for ending war, saving the earth and, now, climate change.  

Thursday, 19. August 2021

Identity Praxis, Inc.

“Here’s a bandaid”​ – musings on the T-Mobile data breach and what we need to do next


This week, T-Mobile acknowledged a new data breach that has affected 40M+ people, and as with every breach, the impact of this event will continue to affect them for years to come (Paz, 2021). To those impacted, T-Mobile is offering a bandaid: a free McAfee Identity Protection license. Offering an identity protection monitoring service or similar bandaid following a breach is an industry-standard practice. The practice of offering the bandaid, however, provides little salve. What is even worse and unfortunate, since some action is better than nothing, is that most people don’t take advantage of it when it is offered1.

Is T-Mobile’s offer a start? Yes. Is it enough? In my opinion, no. We need to do more. People worldwide are concerned for their privacy and data (Auxier et al., 2019; Langer et al., 2021); they have been for a long time (Warren & Brandeis, 1890), and rightfully so, as evidenced by the fact that there have been more than 10,000 data breaches since 2005 (2020 in Review Data Breach Report, 2021; “Data Breaches,” 2021; U.S. Data Breaches and Exposed Records 2020, 2021) and 11.5 billion breached passwords have been recorded out in the wild (Haveibeenpwnd?, 2021). And the problem is only going to get worse as we become more and more connected in the coming years: one estimate predicts that by 2025 people will be interacting with and leaking their identity and data to IoT devices 4,800 times a day, i.e. about every 3.3 seconds (Reinsel et al., 2017), and the ITRC predicts that in 2021 we’ll experience the most data breaches ever in a single year; the good news is that the number of people impacted will be lower than in previous years (“Data Breaches Are Up 38 Percent in Q2 2021; The ITRC Predicts a New All-Time High by Year’s End,” 2021).

People lack and want control over their physical and digital selves (aka their data) (DMA & Acxiom, 2015). They want their privacy, but they don’t know where to start, and they lack the tools and education to manage their data (Babun et al., 2019; “Computer Services & The Harris Poll,” 2019).

Identity Protection Services Is Not Enough: The Harms & Costs Caused by a Data Breach

Identity protection services, like those being offered to the affected, can remind us that we have a problem, that the “Cows” have gotten out of the barn, but they don’t offer a solution. These services will tell someone that a personal attribute, e.g. an email address or social security or government ID number, has been found on the dark web. They rarely tell someone much more, e.g. how their data got leaked in the first place or any other meaningful, actionable insight2. Moreover, they don’t address the emotional, time, economic, physical (inc. life), reputational, relationship, chilling effect, discrimination, thwarted expectations, control, data quality, informed choice, vulnerability, disturbance, autonomy, social, civic, and political harms that people may immediately experience following a breach or that may befall them years after a breach has occurred (Calo, 2010; Citron & Solove, 2021), i.e. long after the identity protection monitoring service bandaid has dried up and fallen off.

The total cost of the potential immediate and long-term harm exposure from a breach far exceeds the $39.99/year value of the identity protection service bandaid. For many, it can take days, weeks, or even years to find out their data was compromised, and it can take many hundreds of hours and upwards of thousands of dollars to recover from a severe breach or misuse of their data (“Data Breaches Are Up 38 Percent in Q2 2021; The ITRC Predicts a New All-Time High by Year’s End,” 2021).

And so far we’ve just been talking about “material” past or current harms. What about future harm, i.e. lost opportunity? For example, the lost opportunity to buy a car or house because the breach trashed your credit score and you can’t get the inaccuracies removed. The FTC reported that 20% of people have at least one error in their credit report (Fiano, 2019). Or, the opportunity that could be gained by having control over one’s data (e.g. in the form of a personal data store or personal information management system) and using it to learn about one’s self, to more efficiently navigate life, or even to profit from one’s own records, attributes, labor, or capital data?

It’s time to give people a seat at the table

The elephant in the room, and one not taken nearly seriously enough, is that our personal data has value, and this data and value should be under the control of the data subject, i.e. the person that it relates to or is generated by. As the former EU commissioner Meglena Kuneva noted as far back as 2009,

“Personal data is the new oil of the internet and the new currency of the digital world.” Meglena Kuneva – European Consumer Commissioner – Keynote Speech – Roundtable on Online Data Collection, Targeting and Profiling (2009).

Why is it not addressed? Possibly because the industry says people don’t care about their data? Or because we think regulations will take care of it? More likely, it is because it threatens the efficiency of existing operations and business models, and because it is just not practical at scale today and is too hard to implement at this time (Pinarbasi & Pavagadhi, 2020). In aggregate, people’s data is worth trillions of dollars. Corporations are taking the lion’s share of the benefits while individuals are left holding unmitigated risks. The Identity Nexus equation, the equilibrium state where benefit and risk are equally shared throughout society, is out of balance.

It is time we empower people and give them a share of the riches they are generating, which is worth far more than the free account, recommendation, or article they are getting today. It is time we get The Identity Nexus equation back into balance. Our privacy should not be a luxury good, as it is today. Today people are the entrée being served up to industry, primarily in the form of marketing, risk mitigation, and people search (Dixon & Gellman, 2014; “FTC to Study Data Broker Industry’s Collection and Use of Consumer Data,” 2012). It is time we move them off the table, and give them a seat at the table. If we enable them to be active participants in the collection, management, and exchange of their data, the personal, civic, social, and commercial bounties will be plentiful. This is not an idea problem, nor a technology problem; it is an imagination and will problem. The ideas have been with us for decades (Bush, 1945; Laudon, 1996; Personal Data, 2011), and the technology is maturing at a breakneck pace. There are pockets of innovation happening today where people are working on putting people in control of their data, like MyData (see Langford et al. (2020) MyData operators report), the Mobile Ecosystem Forum PD&I working group, the Internet Identity Workshops, the many self-sovereign identity working groups at the W3C, the Decentralized Identity Foundation, and the Trust over IP Foundation, and The Good Health Pass Collaborative (a group working on a self-sovereign COVID testing credential), to name just a few. The problem is, we’ve simply gotten too comfortable with the status quo and collectively we simply can’t imagine a different world.

We need systemic change.

“We’re entering an age of personal big data, and its impact on our lives will surpass that of the Internet” (Maney, 2014).

Being reminded that there is a problem is not enough to address the problem. We need to prevent harm, or at least mitigate it, before it occurs, as well as address other harms, i.e. the illicit and legal misuses or non-permitted use of our data, and the lost opportunity that people may realize from having cross-sector access and control of their data. In the end, the individual can be the only one that has a complete view of themselves (Brohman et al., 2003). We need to create opportunities for personal fulfillment. Identity protection is a start. But, what people need is control. Contracts, terms of service, and privacy policies are not enough. Regulation is not enough. Trust in commercial and non-commercial institutions to do “the right thing” is not enough. People need to be in a position where they can “trust but verify.”

Five-pillars of digital sovereignty for the phygital human

To control their digital self, people need a systematic framework to embrace the five pillars of digital sovereignty–awareness, intention & behavior, insurance, rights, and technology–all of which rests on education. People need education to understand the problem, to know how and when to use the utilities and services, and how and when to take specific actions that suit their personal circumstances, their context. Moreover, regarding rights, they need regulation that recognizes privacy harm, not just privacy law (Gilliland, 2019).

As an industry, we should not just be offering bandaids; we should be providing a suite of convenient, unobtrusive, passive and active, value-generating utilities, services and education (aka privacy-enhancing technologies and personal identity management capabilities) that help people take back control of their data, their digital self. We need to put in the time to build exceptional customer experience, user experience, and contextually relevant content.

“Content is King, but Context is God” (Vaynerchuk, 2017)

We live in a connected digital age. We have become phygital beings (physical + digital). Today, for many, the digital part of us has more personal, social, and economic value than the physical self. It is time for people to have control of what matters most–their digital self, alongside their physical self. We need to be whole again.

ENDNOTES

Bernard (2020) reported that only 1 in 10 Americans took advantage of the settlement offered following the 2017 Equifax data breach that impacted 147 million Americans. The reality is that, due to the prevalence of data sharing and exchange and the sheer number of data breaches tracked since 2005, it is nearly impossible to track the original source of breached data. It is estimated that there have been well over 10,000 data breaches since 2005 (2020 in Review Data Breach Report, 2021; “Data Breaches,” 2021; U.S. Data Breaches and Exposed Records 2020, 2021). And, according to the ITRC, 2021 is on track to be a record year for data breaches. The number of breaches in Q2 2021 was up 38 percent over Q1 2021. For the year, the H1 2021 breaches account for 76 percent of the 2020 total. The good news, however, is that the total number of people impacted in 2021 is going down (“Data Breaches Are Up 38 Percent in Q2 2021; The ITRC Predicts a New All-Time High by Year’s End,” 2021).

REFERENCES

2020 in Review Data Breach Report (pp. 1–29). (2021). Identity Theft Resource Center. https://notified.idtheftcenter.org/s/

Auxier, B., Rainie, L., Anderson, M., Perrin, A., Kumar, M., & Turner, E. (2019). Americans and Privacy: Concerned, Confused and Feeling Lack of Control Over Their Personal Information. In Pew Research Center: Internet, Science & Tech. https://www.pewresearch.org/internet/2019/11/15/americans-and-privacy-concerned-confused-and-feeling-lack-of-control-over-their-personal-information/

Babun, L., Celik, Z. B., McDaniel, P., & Uluagac, S. (2019). Real-time Analysis of Privacy-(un)aware IoT Applications.

Bernard, T. S. (2020). Equifax Breach Affected 147 Million, but Most Sit Out Settlement. In New York Times (Online). New York Times Company. https://www.nytimes.com/2020/01/22/business/equifax-breach-settlement.html

Brohman, M. K., Watson, R. T., Piccoli, G., & Parasuraman, A. (2003). Data completeness: A key to effective net-based customer service systems. Communications of the ACM, 46(6), 47–51. https://doi.org/10.1145/777313.777339

Bush, V. (1945). As We May Think – Vannevar Bush – The Atlantic. In The Atlantic. http://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/?single_page=true

Calo, R. (2010). The Boundaries of Privacy Harm. Indiana Law Journal, 86(3). https://papers.ssrn.com/abstract=1641487

Citron, D. K., & Solove, D. J. (2021). Privacy Harms (Public Law Research Paper ID 3782222). GWU Law School. https://doi.org/10.2139/ssrn.3782222

Consumers have concerns about cybersecurity, value education on best practices. (2019). In HelpNetSecurity. https://www.helpnetsecurity.com/2019/10/07/consumers-cybersecurity-awareness/

Data Breaches. (2021). In Privacy Rights Clearinghouse. https://privacyrights.org/data-breaches

Data Breaches Are Up 38 Percent in Q2 2021; The ITRC Predicts a New All-Time High by Year’s End. (2021). In Identity Theft Resource Center. https://www.idtheftcenter.org/data-breaches-are-up-38-percent-in-q2-2021-the-identity-theft-resource-center-predicts-a-new-all-time-high-by-years-end/

Dixon, P., & Gellman, R. (2014). The Scoring of America: How Secret Consumer Scores Threaten Your Privacy and Your Future (pp. 1–89). World Privacy Forum. https://www.ftc.gov/system/files/documents/public_comments/2014/08/00014-92369.pdf

DMA, & Acxiom. (2015). Data privacy: What the consumer really thinks (pp. 1–29). The Future Foundation. https://dma.org.uk/uploads/misc/5a857c4fdf846-data-privacy—what-the-consumer-really-thinks-final_5a857c4fdf799.pdf

Fiano, L. (2019). Common errors people find on their credit report – and how to get them fixed. In Consumer Financial Protection Bureau. https://www.consumerfinance.gov/about-us/blog/common-errors-credit-report-and-how-get-them-fixed/

World Economic Forum. (2011). Personal Data: The Emergence of a New Asset Class (p. 40). The World Economic Forum. http://www.weforum.org/reports/personal-data-emergence-new-asset-class

FTC to Study Data Broker Industry’s Collection and Use of Consumer Data. (2012). In Federal Trade Commission. http://www.ftc.gov/news-events/press-releases/2012/12/ftc-study-data-broker-industrys-collection-use-consumer-data

Gilliland, D. (2019). Privacy law needs privacy harm. In TheHill. https://thehill.com/opinion/cybersecurity/459427-privacy-law-needs-privacy-harm

Have I Been Pwned. (2021). Have I Been Pwned: Check if your email has been compromised in a data breach. https://haveibeenpwned.com/

Langer, B., Becker, M., Lacey, J., Betti, D., Craig, T., Berg, S., Imperi, V., & Ibrahim, D. (2021). MEF 7th Global Consumer Trust Report. Mobile Ecosystem Forum.

Langford, J., Poikola, A. ’Jogi’., Janssen, W., Lähteenoja, V., & Rikken, M. (2020). Understanding MyData Operators (pp. 1–40). MyData. https://mydata.org/wp-content/uploads/sites/5/2020/04/Understanding-Mydata-Operators-pages.pdf

Laudon, K. C. (1996). Markets and privacy. Communications of the ACM, 39(9), 92–104. https://doi.org/10.1145/234215.234476

Maney, K. (2014). ’Big Data’ Will Change How You Play, See the Doctor, Even Eat. In Newsweek. https://www.newsweek.com/2014/08/01/big-data-big-data-companies-260864.html

Meglena Kuneva – European Consumer Commissioner – Keynote Speech – Roundtable on Online Data Collection, Targeting and Profiling. (2009). European Commission. https://ec.europa.eu/commission/presscorner/detail/en/SPEECH_09_156

Paz, I. G. (2021). T-Mobile Says Hack Exposed Personal Data of 40 Million People. The New York Times. https://www.nytimes.com/2021/08/18/business/tmobile-data-breach.html

Pinarbasi, A. T., & Pavagadhi, J. (2020). 3 benefits for businesses to adopt PDS. https://iapp.org/news/a/3-benefits-for-businesses-to-adopt-pds/

Reinsel, D., Gantz, J., & Rydning, J. (2017). Data Age 2025: The Evolution of Data to Life-Critical Don’t Focus on Big Data; Focus on the Data That’s Big (pp. 1–25). IDC. https://www.seagate.com/www-content/our-story/trends/files/Seagate-WP-DataAge2025-March-2017.pdf

U.S. Data breaches and exposed records 2020 (p. 1). (2021). Statista. https://www.statista.com/statistics/273550/data-breaches-recorded-in-the-united-states-by-number-of-breaches-and-records-exposed/

Vaynerchuk, G. (2017). Content is King, But Context is God. In GaryVaynerchuk.com. https://www.garyvaynerchuk.com/content-is-king-but-context-is-god/

Warren, S., & Brandeis, L. (1890). The Right to Privacy. Harvard Law Review, 4(5), 193–220.

The post “Here’s a bandaid”​ – musings on the T-Mobile data breach and what we need to do next appeared first on Identity Praxis, Inc..


Simon Willison

lex.go in json5-go

lex.go in json5-go This archived GitHub repository has a beautifully clean and clear example of a hand-written lexer in Go, for the JSON5 format (JSON + comments + multi-line strings). parser.go is worth a look too.

lex.go in json5-go

This archived GitHub repository has a beautifully clean and clear example of a hand-written lexer in Go, for the JSON5 format (JSON + comments + multi-line strings). parser.go is worth a look too.


Jon Udell

Postgres set-returning functions that self-memoize as materialized views

In episode 2 I mentioned three aspects of pl/python that are reasons to use it instead of pl/pgsql: access to Python modules, metaprogramming, and introspection. In episode 5 I discussed metaprogramming, by which I mean using pl/python to compose and run SQL code. This episode features introspection, by which I mean taking advantage of Python’s … Continue reading Postgres set-returning functions th

In episode 2 I mentioned three aspects of pl/python that are reasons to use it instead of pl/pgsql: access to Python modules, metaprogramming, and introspection. In episode 5 I discussed metaprogramming, by which I mean using pl/python to compose and run SQL code. This episode features introspection, by which I mean taking advantage of Python’s inspect module to enable a pl/python function to discover its own name.

Why do that? In this context, so that the function can create a materialized view by joining its own name with the value of its first parameter. Here’s the example from episode 5.

create function questions_and_answers_for_group(_group_id text)
  returns setof question_and_answer_for_group as $$
  from plpython_helpers import (
    exists_group_view,
    get_caller_name,
    memoize_view_name
  )
  base_view_name = get_caller_name()
  view_name = f'{base_view_name}_{_group_id}'
  if exists_group_view(plpy, view_name):
    sql = f"""
      select * from {view_name}
      """
  else:
    sql = f"""
      -- SQL THAT RETURNS A SETOF QUESTION_AND_ANSWER_FOR_GROUP
      """
    memoize_view_name(sql, view_name)
    sql = f"""
      select * from {view_name}
      """
  return plpy.execute(sql)
$$ language plpython3u;

The function drives a panel on the course dashboard. An initial call to, say, questions_and_answers_for_group('P1mQaEEp'), creates the materialized view questions_and_answers_for_group_p1mqaeep and returns SELECT * from the view. Subsequent calls skip creating the view and just return SELECT * from it.

Note that even though the group name is mixed case, the view name created by Postgres is all lowercase. For example:

create materialized view test_AbC as (select 'ok') with data;
SELECT 1

\d test_AbC
Materialized view "public.test_abc"

I want to think of this as a form of capability injection, but it’s really more like a capability wrapper. The capability is memoization. A function endowed with it can run a SQL query and cache the resulting rows in a materialized view before returning them to a SQL SELECT context. The wrapper is boilerplate code that discovers the function’s name, checks for the existence of a corresponding view, and if it isn’t found, calls memoize_view_name(sql, view_name) to run an arbitrary chunk of SQL code whose result set matches the function’s type. So in short: this pattern wraps memoization around a set-returning pl/python function.

As noted in episode 5, although memoize_view_name is called from pl/python, it is not itself a pl/python function. It’s a normal Python function in a module that’s accessible to the instance of Python that the Postgres pl/python extension is using. In my case that module is just a few small functions in a file called plpython_helpers.py, installed (cough, copied) to user postgres‘s ~/.local/lib/python3.8/site-packages/plpython_helpers.py.

So far, there are only two critical functions in that module: get_caller_name() and memoize_view_name.

Here is get_caller_name().

def get_caller_name():
    import inspect
    import re
    base_view_name = inspect.stack()[1][3].replace('__plpython_procedure_', '')
    return re.sub(r'_\d+$', '', base_view_name)

The internal name for a pl/python function created by CREATE FUNCTION foo() looks like __plpython_procedure_foo_981048462. What get_caller_name() returns is just foo.

Here’s memoize_view_name().

import base64
import os

def memoize_view_name(sql, view_name):
    sql = sql.replace('\n', ' ')
    encoded_bytes = base64.b64encode(sql.encode('utf-8'))
    encoded_str = str(encoded_bytes, 'utf-8')
    cmd = f"""psql -d h_analytics -c "call memoizer('{encoded_str}', '{view_name}')" """
    result = os.system(cmd)
    print(f'memoize_view_name: {cmd} result: {result}')

Given a chunk of SQL and the name of a view, it converts newlines to spaces, base64-encodes the query text, and invokes psql to call a procedure, memoizer, that does the work of running the SQL query and creating the materialized view from those results. So for example the function that yields sparkline data for a school might look like sparkline_data_for_school('af513ee'), and produce the view sparkline_data_for_school_af513ee.

Why shell out to psql here? It may not be necessary, there may be a way to manage the transaction directly within the function, but if so I haven’t found it. I’m very far from being an expert on transaction semantics and will appreciate guidance here if anyone cares to offer it. Meanwhile, this technique seems to work well. memoizer is a Postgres procedure, not a function. Although “stored procedures” is the term that I’ve always associated with in-database programming, I went pretty far down this path using only CREATE FUNCTION, never CREATE PROCEDURE. When I eventually went there I found the distinction between functions and procedures to be a bit slippery. This StackOverflow answer matches what I’ve observed.

PostgreSQL 11 added stored procedures as a new schema object. You can create a new procedure by using the CREATE PROCEDURE statement.

Stored procedures differ from functions in the following ways:

Stored procedures do not have to return anything, and only return a single row when using INOUT parameters.

You can commit and rollback transactions inside stored procedures, but not in functions.

You execute a stored procedure using the CALL statement rather than a SELECT statement.

Unlike functions, procedures cannot be nested in other DML commands (SELECT, INSERT, UPDATE, DELETE).

Here is the memoizer procedure. It happens to be written in pl/python but could as easily have been written in pl/pgsql using the built-in Postgres decode function. Procedures, like functions, can be written in either language (or others) and share the common Postgres type system.

create procedure memoizer(_sql text, _view_name text) as $$
  import base64
  decoded_bytes = base64.b64decode(_sql)
  decoded_str = str(decoded_bytes, 'utf-8')
  create = f"""
    create materialized view if not exists {_view_name} as (
      {decoded_str}
    ) with data;
    """
  plpy.execute(create)
  permit = f"""
    grant select on {_view_name} to analytics;
    """
  plpy.execute(permit)
$$ language plpython3u;

There’s no plpy.commit() here because psql takes care of that automatically. Eventually I wrote other procedures, some of which do their own committing, but that isn’t needed here.

Of course it’s only possible to shell out to psql from a function because pl/python is an “untrusted” language extension. Recall from episode 1:

The ability to wield any of Python’s built-in or loadable modules inside Postgres brings great power. That entails great responsibility, as the Python extension is “untrusted” (that’s the ‘u’ in ‘plpython3u’) and can do anything Python can do on the host system: read and write files, make network requests.

Using Python’s os.system() to invoke psql is another of those superpowers. It’s not something I do lightly, and if there’s a better/safer way I’m all ears.

Meanwhile, this approach is delivering much value. We have two main dashboards, each of which displays a dozen or so panels. The school dashboard reports on annotation activity across all courses at a school. The course dashboard reports on the documents, and selections within documents, that instructors and students are discussing in the course’s annotation layer. Each panel that appears on the school or course dashboard is the output of a memoized function that is parameterized by a school or course id.

The data warehouse runs on a 24-hour cycle. Within that cycle, the first call to a memoized function takes just as long as it takes to run the SQL wrapped by the function. The cached view only comes into play when the function is called again during the same cycle. That can happen in a few different ways.

– A user reloads a dashboard, or a second user loads it.

– A panel expands or refines the results of another panel. For example, questions_and_answers_for_group() provides a foundation for a family of related functions including:

– questions_asked_by_teacher_answered_by_student()

– questions_asked_by_student_answered_by_teacher()

– questions_asked_by_student_answered_by_student()

– A scheduled job invokes a function in order to cache its results before any user asks for them. For example, the time required to cache panels for school dashboards varies a lot. For schools with many active courses it can take minutes to run those queries, so preemptive memoization matters a lot. For schools with fewer active courses it’s OK to memoize on the fly. This method enables flexible cache policy. Across schools we can decide how many of the most-active ones to cache. Within a school, we can decide which courses to cache, e.g. most recent, or most active. The mechanism to display a dashboard panel is always the same function call. The caching done in support of that function is highly configurable.
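The scheduled-job case above amounts to calling the memoized functions before anyone loads a dashboard. Here is a minimal warm-up sketch in pl/pgsql; the school table and its id and is_active columns are hypothetical stand-ins rather than part of the actual schema, and sparkline_data_for_school is the memoized function mentioned earlier in this post:

do $$
declare
  _school_id text;
begin
  -- warm the cache: the first call in a cycle builds the materialized view,
  -- later calls just read from it; PERFORM discards the returned rows
  for _school_id in select id from school where is_active loop
    perform * from sparkline_data_for_school(_school_id);
  end loop;
end;
$$;

Run from cron or any other scheduler at the start of the 24-hour cycle, something like this primes the most expensive dashboards before anyone asks for them.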

Caches, of course, must be purged. Since these materialized views depend on core tables it was enough, early on, to do this for views depending on the annotation table.

drop table annotation cascade;

At a certain point, with a growing number of views built during each cycle, the cascade failed.

ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.

That wasn’t the answer. Instead we switched to enumerating views and dropping them individually. Again that afforded great flexibility. We can scan the names in the pg_matviews system table and match all the memoized views, or just those for a subset of schools, or just particular panels on school or course dashboards. Policies that govern the purging of cached views can be as flexible as those that govern their creation.
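Here is a minimal sketch of that selective purge, assuming the memoized views can be matched by the name prefix of the function that created them; the LIKE pattern shown is illustrative and would be varied to express whatever purge policy is wanted:

do $$
declare
  _view record;
begin
  -- enumerate matching materialized views and drop them one at a time,
  -- avoiding the lock buildup of a single cascading drop
  for _view in
    select schemaname, matviewname
    from pg_matviews
    where matviewname like 'questions_and_answers_for_group_%'
  loop
    execute format('drop materialized view if exists %I.%I',
                   _view.schemaname, _view.matviewname);
  end loop;
end;
$$;

Widening or narrowing the pattern is what makes the purge policy as flexible as the creation policy.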


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/


Werdmüller on Medium

The three-dimensional engineer

Hiring scrappy, creative people who bring their whole selves to work. Continue reading on Medium »

Hiring scrappy, creative people who bring their whole selves to work.

Continue reading on Medium »


Doc Searls Weblog

A happy 75th anniversary

My parents (that’s them, Eleanor and Allen Searls) were married on 17 August 1946, seventy-five years and two days ago. I would have posted something then, but I was busy—though not too busy to drop something in Facebook, where much of the readership for this blog, plus the writership of others listed in my old […]

My parents (that’s them, Eleanor and Allen Searls) were married on 17 August 1946, seventy-five years and two days ago. I would have posted something then, but I was busy—though not too busy to drop something in Facebook, where much of the readership for this blog, plus the writership of others listed in my old blogroll, has drifted in the Age of Social Media. Alas, blogging is less social than Facebook, Twitter, Instagram and the chatterteriat. But that doesn’t stop me from blogging anyway.

The wedding took place in Minneapolis, for the convenience of Mom’s family of second and third generation Swedish members of the homesteading diaspora, scattered then around Minnesota, North Dakota and Wisconsin. Pop was from New Jersey, and all his immediate kin were there and in New York. After the wedding the couple came east to briefly occupy the home they rented in North Bergen, New Jersey while mostly hanging at Grandma Searls’ house in Fort Lee (where Pop grew up with his two sisters), and then a short drive west of there in Maywood, where Jan and I grew up. I was born less than a year later, and my sister Jan less than two years after that.

In a comment under my Facebook post, Jan writes,

Mom from ND and Pop from NJ met in Alaska in the middle of WWII. He’d already served in the Coastal Artillery in the early 30s but after D-Day came home to join up. They courted by mail after the war while he was with SHAEFE (he loved that acronym: Supreme HQ Allied Expeditionary Forces Europe), and Mom with the Red Cross at a Naval Hospital in Oregon. When he got home, she flew to NJ for 6 days of courtship – at a small shack at the NJ shore with Pop’s entire family! He came to MN the night before the wedding. They fell in love with the dream of having a family and future together, and always said they really fell in love with each other on their honeymoon and were devoted to each other. Mom was 33, Pop was 38, and they’d already lived lives of adventure, full of friends and family. We grew up knowing we were blessed to have them as our parents.

I’ve added links. The Shack is still there, by the way.

Alas, Mom passed in ’03 and Pop in ’79. But they were exceptionally fine parents and grandparents. Not all kids are so lucky.

So, a belated toast, in pixels.


Identity Praxis, Inc.

When Payments, Messaging and Digital Identity Meet, MEF LATAM Connects August 18, 2021

I had a thoroughly enjoyable time alongside fellow industry leaders at the Mobile Ecosystem Forum’s MEF Connects LATAM event on August 18, 2021. Speakers: Jason Pareja Jauregui, Product Development Manager – B89 Breno Pilar, Vice President – Ipification Ramy Riad, Vice President Future Messaging – imimobile Sean Whitley, Vice President of Sales – Mitto Michael […] The post When Payments, Messagi

I had a thoroughly enjoyable time alongside fellow industry leaders at the Mobile Ecosystem Forum’s MEF Connects LATAM event on August 18, 2021.

Speakers:

Jason Pareja Jauregui, Product Development Manager – B89
Breno Pilar, Vice President – Ipification
Ramy Riad, Vice President Future Messaging – imimobile
Sean Whitley, Vice President of Sales – Mitto
Michael Becker, CEO – Identity Praxis, Inc.

Watch the video

The post When Payments, Messaging and Digital Identity Meet, MEF LATAM Connects August 18, 2021 appeared first on Identity Praxis, Inc..

Wednesday, 18. August 2021

Identity Praxis, Inc.

A [Mobile] Marketing Authors’​ Top 3 Favorite Resources

To be a mobile marketer is to be a marketer. Mobile is table stakes to marketers, as evidenced by the ubiquitous adoption of mobile devices and services throughout the world in every aspect of our lives. As a marketer, it is important to remember that mobile is not used just for commercial activities; it is […] The post A [Mobile] Marketing Authors’​ Top 3 Favorite Resources appeared first on Id

To be a mobile marketer is to be a marketer.

Mobile is table stakes to marketers, as evidenced by the ubiquitous adoption of mobile devices and services throughout the world in every aspect of our lives. As a marketer, it is important to remember that mobile is not used just for commercial activities; it is used throughout every facet of our lives. We use our mobile devices for everything, from work, play, social, banking, shopping, sex, taxes, death, and religion. No industry, no company, no relationship is immune to mobile. Mobile contributes nearly $4.4 trillion to the world’s GDP (The Mobile Economy 2021, 2021). Mobile plays a role in every customer experience and journey; what Julie Ask (2014) considers micro-moments. With mobile, people are connected, and it is important to recognize that the connected individual is the media, is the point-of-sale display. The connected individual is the future of your business. You must learn to be of service to the connected individual and eventually engage them on their terms (more on this later).

By the Numbers

The numbers are clear. There are,

5.22 billion unique mobile users (66% of the global population) (Kemp et al., 2021), which is estimated to grow to 5.7 billion by 2025 (The Mobile Economy 2021, 2021). The mobile device is the great unifying digital platform across all racial groups, at least in the U.S. (Atske & Perrin, 2021).
8.02 billion mobile connections (yes, many of us have more than one mobile device) (Kemp et al., 2021), which is predicted to grow to 8.8 billion by 2025 (The Mobile Economy 2021, 2021); if you take connected devices (devices without a direct mobile network connection) into account, the theoretical connection count goes even higher, to 24 billion by 2025 (The Mobile Economy 2021, 2021).
Time spent online and in mobile apps is going through the roof; the average person worldwide spends 6h 54m a day on the Internet and 4h 10m a day in mobile apps (Kemp et al., 2021).
As of Oct. 2020, mobile device market share (aka web traffic) exceeds desktop computers; today mobile accounts for 55.7% of web traffic compared to 41.5% for desktops and 2.73% for tablets (“Desktop Vs Mobile Vs Tablet Market Share Worldwide,” 2021).
Mobile accounts for 31% of retail e-commerce sales in the U.S. (Coppola, 2021).

We could list stat after stat, but the story will be the same. Mobile, connectivity, is here, and it is not going anywhere.

Three Favorite Resources

I was asked by Stukent, “What are your three top favorite mobile marketing resources?” Sidebar: I’m a co-author of Mobile Marketing Essentials (Hanley et al., 2021).

This is a difficult question to answer, not because I don’t have favorite resources, but because of all of the follow-up questions that come to mind. Resources for what? Understanding the people and their sentiments, needs, wants, and desires? Understanding the connected individual’s behavior? Understanding a specific market sector? Understanding the latest and greatest technologies, where they’ve been, where they are, and where they’re going? Understanding and influencing change in organizational leadership, structure, and management? Steps for building out and executing on a multi-channel comms matrix? Methods for measurement and tracking? Estimates for building out a mobile presence—web, apps, messaging? Different engagement tactics at each step along the customer journey? Strategies for mobile enhancing physical media? Understanding the regulatory front and what’s going to happen to all the data to which we’ve become so addicted? Remember, people are connected, they are concerned, and they want control of their data. Pretty soon, you’ll be accessing data on the connected individual’s terms, not yours. By 2023, per a Gartner prediction, 65% of the world’s population will be protected by modern privacy legislation, up from 10% in 2020 (“Gartner Says By 2023, 65% of the World’s Population Will Have Its Personal Data Covered Under Modern Privacy Regulations,” 2020).

To be an exceptional mobile marketer is to be a polymath. You must be able to look at and evaluate the market and serve your customers through these five lenses—technology, economics, law and regulation, culture, and politics—if you are going to effectively serve people, at scale, and on their terms.

As for what my favorite resources are, rather than pointing to three specific resources, I’d prefer to suggest three resource categories:

The people you serve. The people you serve are your greatest and most important resource. You must do everything in your power to understand them and fulfill their needs. Keep in mind, as the tools of surveillance capitalism are pulled apart (Zuboff, 2019), it will be even more important for you to find ways to build direct, trusted, bi-directional channels of communications. Your customer database will become your most valued asset.

Government and non-governmental organizations. Governments, non-governmental organizations, agencies, media properties, and schools (trade groups, sector analysts, consumer advocates, think-tanks/fact tanks, like the Pew Research Center (“Pew Research Center,” 2021) or Chiefmartec (“Chief Marketing Technologist – Marketing Technology Management,” 2021), and working professors and libraries, etc.) are an invaluable treasure-trove of specific and directional consumer and market insight that you can tap into, often for free.

MarTech vendors. Vendors provide insights on how to engage the connected (mobile) individual by providing blogs, research, and reports. Yes, they’re there to sell you their services, so you need to be careful and try to read through the hyperbole at times, but if you want to learn about a mobile channel, engage the vendor. Get your hands dirty, subscribe to a free trial of their service. Read their reports. Attend their webinars. Build a relationship with them.

We live in a connected world; consequently, mobile, regardless of the form factor and medium—smartphone, tablet, watch, earbud, voice, email, SMS, OTT message, etc.—is here to stay.

REFERENCES

Ask, J. (2014). Micro Moments Are The Next Frontier For Mobile. Forrester. https://www.forrester.com/report/Micro+Moments+Are+The+Next+Frontier+For+Mobile/RES118691

Atske, S., & Perrin, A. (2021). Home broadband adoption, computer ownership vary by race, ethnicity in the U.S. In Pew Research Center. https://www.pewresearch.org/fact-tank/2021/07/16/home-broadband-adoption-computer-ownership-vary-by-race-ethnicity-in-the-u-s/

Chief Marketing Technologist – Marketing Technology Management. (2021). In Chief Marketing Technologist. https://chiefmartec.com/

Coppola, D. (2021). Topic: Mobile commerce in the United States. Statista. https://www.statista.com/topics/1185/mobile-commerce/

Desktop vs Mobile vs Tablet Market Share Worldwide. (2021). In StatCounter Global Stats. https://gs.statcounter.com/platform-market-share/desktop-mobile-tablet

Gartner Says By 2023, 65% of the World’s Population Will Have Its Personal Data Covered Under Modern Privacy Regulations. (2020). Gartner. https://www.gartner.com/en/newsroom/press-releases/2020-09-14-gartner-says-by-2023–65–of-the-world-s-population-w

Hanley, M., McCabe, M., & Becker, M. (2021). Mobile Marketing Essentials (Fourth ed.). Stukent. https://www.stukent.com/mobile-marketing-textbook/

Kemp, S., Kepios, Hootsuite, & We Are Social. (2021). Digital 2021: Global Overview Report. https://datareportal.com/reports/digital-2021-global-overview-report

Pew Research Center. (2021). In Pew Research Center. https://www.pewresearch.org/

The Mobile Economy 2021. (2021). GSMA. https://www.gsma.com/mobileeconomy/wp-content/uploads/2021/07/GSMA_MobileEconomy2021_3.pdf

Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power (1st Edition). PublicAffairs.

The post A [Mobile] Marketing Authors’​ Top 3 Favorite Resources appeared first on Identity Praxis, Inc..


MyDigitalFootprint

Testing the fitness of your organisation's preparedness for data.

Click here to access all the articles and archive for </Hello CDO> Day zero, you have arrived and you have 100 days to plan. (H0) How do you determine if your new company is addressing the underlying issues that hold back data from being what they imagine it can be?  The issues that hold back an organisation from really capturing the value of data are at a  minimum: org struct

Click here to access all the articles and archive for </Hello CDO>

Day zero, you have arrived and you have 100 days to plan. (H0) How do you determine if your new company is addressing the underlying issues that hold back data from being what they imagine it can be? 

The issues that hold back an organisation from really capturing the value of data are, at a minimum: org structures, people issues, a lack of accountability, and incentives. Whilst having a CDO, doing data science, having analytics, using artificial intelligence, testing data quality, and a world-class data governance structure make a difference, true transformation will remain a struggle if the structural issues remain.

The question is: how do we, as a CDO, test the fitness of our organisation's preparedness for data? Whether the results are acceptable to the leadership team is a question of politics and is beyond the scope of this article.

To test the fitness of your organisation's preparedness for data, it is worth looking at four areas: whether there is a preparedness plan, how they put data to work today, organisational capability, and finally culture.

Is there a preparedness plan?

This is really simple: ask the CEO if there is a preparedness plan. If there is, ask for a copy of it and review it with the framing of data. If there is not one, it is probably best to ask why not. There may be one, but check the date of publication (issue), and if it does not address all the areas of responsibility that come under the CDO, you need to flag that it needs work.

How they put data to work today.

Does the company value “small data” as much as big data and data science? Does it hold small data with the same level of importance and priority as big data? Small data is also termed thick data. It is small in size but has depth of insight.

The media attention on data science makes it appear far easier than it is. You need to determine whether this organisation bases its data science on social media posts or on a real commitment of time and resources. It is worth investigating if there is built-in structural animosity between the data science teams, which are invested in driving change, and the rest of the business, which is driven by KPIs and incentives. Where do the teams sit and how are they bridged?

Lastly review past management and board papers and recommendations to determine if there is attestation of data or if the blind are leading the blind.  

Organizational Capability

A company’s organisation is supposed, amongst other things, to make it possible for people to do their work, with control in place to ensure they are doing the right work. Organisational capability can be a major hurdle when it comes to making data work, as divisions are often organised to optimise for different objectives. 

There are numerous issues including management’s confusion and conflation of data, technology, data silos, and the lack of clear responsibility around data roles. When there is a lack of skilled data architects, data engineers, and data quality professionals the organisation just does not have the capability or structure. 

All regular readers of this column know that I often verbally fight against leaders who opine that data is the new oil, preach that data is an asset, and demand data-driven decisions - as they don’t have a clue. The reality for most employees is that “data” to them is just another thing they need in order to do their jobs. Companies don’t sort out ways to create value from data as they don’t teach people how to use it to make better decisions. (my masterclass does) 

The organisational changes needed to address the structural issues are the responsibility of a company’s senior leadership. Yet most leaders appear to be sitting on the sidelines, perhaps fearful themselves or unsure of what to do and waiting for McKinsey, Deloitte, Accenture, PWC, BCG or one of the other global consulting firms to give them an answer. This is what you are seeking to determine.

Culture

The implications for all those interested in advancing data for better decisions are profound. Culture is often cited as the biggest barrier to progress with data. I will argue, and my analysis confirms it, that it is in fact ontology and epistemology that hold companies back. The questions to ask the organisation are, “Is there a top-level ontology?” and “How do we know that what we know is true?”


Note to the CEO

“Testing the fitness of your organisation's preparedness for data” is one of the most important jobs you will delegate. When interviewing for your next CDO, ask them the question: “How would you test this organisation for preparedness?” There is no point in asking about their data expertise and capability; they would not be sat in front of you if they were not at the top of their game. Ask how they will address the politics and policies based on their review. The point here is that if you like their smooth approach, they are likely the wrong person, as you will just get more compromises and not the action-focussed delivery required to become the data company you desire.



Whilst our ongoing agile iteration into information beings is never-ending, there are the first 100 days in the new role. But what to focus on? Well, that rose-tinted period of conflicting priorities is what </Hello, CDO!> is all about. Maintaining sanity when all else has been lost to untested data assumptions is a different problem entirely.


Tuesday, 17. August 2021

Simon Willison

Quoting Alison Green (Ask a Manager)

The way you motivate someone who doesn’t need the money is the same way you should motivate people who do need the money: by giving them meaningful roles with real responsibility where they can see how their efforts contribute to a larger whole, giving them an appropriate amount of ownership over their work and input into decisions that involve that work, providing useful feedback, recognizing th

The way you motivate someone who doesn’t need the money is the same way you should motivate people who do need the money: by giving them meaningful roles with real responsibility where they can see how their efforts contribute to a larger whole, giving them an appropriate amount of ownership over their work and input into decisions that involve that work, providing useful feedback, recognizing their contributions, helping them feel they’re making progress toward things that matter to them, and — importantly — not doing things that de-motivate people (like yelling or constantly shifting goals or generally being a jerk).

Alison Green (Ask a Manager)

Monday, 16. August 2021

Phil Windley's Technometria

Clean Sheets and Strategy

Summary: Most of the activities IT performs aren't strategic. Like clean sheets, they're important, but not differentiating. How do you determine what's strategic? Evaluate the domains of your business. Suppose you're in the hotel business. One of the things you have to do is make sure customers have clean sheets. If you don't change the sheets or launder them properly, you're proba

Summary: Most of the activities IT performs aren't strategic. Like clean sheets, they're important, but not differentiating. How do you determine what's strategic? Evaluate the domains of your business.

Suppose you're in the hotel business. One of the things you have to do is make sure customers have clean sheets. If you don't change the sheets or launder them properly, you're probably not going to stay in business long. The bad news is that clean sheets are expensive and they don't differentiate you very much—all your competitors have clean sheets. You're stuck.

Consider the following graph plotting the cost of any given business decision against the competitive advantage it brings:

Feature Cost vs Differentiation

To the right in this diagram are the things we'd call strategic, representing features or practices that differentiate the organization from its competitors. The bottom half of the diagram contains the things that are relatively less expensive. Clearly if you're making a decision on what features to implement, you want to be in the lower right quadrant: low cost and high differentiation. Do those first.

The red quadrant seems like the last place you'd look for features, or is it? Think about clean sheets. As I noted earlier, clean sheets cost a lot of money and everyone has them, so there's not much competitive advantage in having them. But there's a huge competitive disadvantage if you don't. No one can do without clean sheets. Businesses are filled with things like clean sheets. For IT, things like availability, security, networks, and deployment are all clean sheets. Doing these well can differentiate you from those who don't, but they're not strategic. You still need a strategy.

How can you discover the things that really matter? I'm a fan of domain driven design. Domain driven design is a tool for looking at the various domains your business is engaged in and then determining which are core (differentiating), supporting, or merely generic. The things you identify as core are strategic—places you can differentiate yourself. This helps because now you know where to build and where to buy. Generic domains aren't unimportant (think HR or finance, for example), they're simply not strategic. And therefore buying those features is likely going to give you high availability and feature fit for far less money than doing it yourself.

On the other hand, domains that are core are the places you differentiate yourself. When you look at your organization's values, mission, and objectives, the core domains ought to directly support them. If you outsource these, then anyone else can do what you're doing. Core, strategic activities are places where it makes sense to build rather than buy. Spend your time and resources there. But don't neglect the sheets.

Domain-Driven Design Distilled by Vaughn Vernon

Concise, readable, and actionable, Domain-Driven Design Distilled never buries you in detail–it focuses on what you need to know to get results. Vaughn Vernon, author of the best-selling Implementing Domain-Driven Design, draws on his twenty years of experience applying DDD principles to real-world situations. He is uniquely well-qualified to demystify its complexities, illuminate its subtleties, and help you solve the problems you might encounter.

Photo Credit: Sheets from pxfuel (Free for commercial use)

Tags: strategy management cio


Simon Willison

Product Hunt Engineering Principles

Product Hunt Engineering Principles Product Hunt implement "Collaborative Single Player Mode", which they define as "A developer should be able to execute a feature from start to finish -- from the database to the backend, API, frontend, and CSS. The goal is never to get blocked." I've encountered this principle applied to teams before (which I really like) but not for individual developers, whi

Product Hunt Engineering Principles

Product Hunt implement "Collaborative Single Player Mode", which they define as "A developer should be able to execute a feature from start to finish -- from the database to the backend, API, frontend, and CSS. The goal is never to get blocked." I've encountered this principle applied to teams before (which I really like) but not for individual developers, which I imagine is more likely to work well for smaller organizations. Intriguing approach.

They also practice trunk driven development with feature flags: "Always start a feature with a feature flag and try to get something to production on day 1."

And "If a product decision is missing, try to make this decision yourself - it's better to ask for forgiveness rather than permission."

Via @mscccc


Damien Bod

Improving application security in ASP.NET Core Razor Pages using HTTP headers – Part 1

This article shows how to improve the security of an ASP.NET Core Razor Page application by adding security headers to all HTTP Razor Page responses. The security headers are added using the NetEscapades.AspNetCore.SecurityHeaders Nuget package from Andrew Lock. The headers are used to protect the session, not for authentication. The application is authenticated using Open […]

This article shows how to improve the security of an ASP.NET Core Razor Page application by adding security headers to all HTTP Razor Page responses. The security headers are added using the NetEscapades.AspNetCore.SecurityHeaders Nuget package from Andrew Lock. The headers are used to protect the session, not for authentication. The application is authenticated using Open ID Connect; the security headers are used to protect the session.

Code: https://github.com/damienbod/AspNetCore6Experiments

Blogs in this series

Improving application security in ASP.NET Core Razor Pages using HTTP headers – Part 1
Improving application security in Blazor using HTTP headers – Part 2
Improving application security in an ASP.NET Core API using HTTP headers – Part 3

The NetEscapades.AspNetCore.SecurityHeaders and the NetEscapades.AspNetCore.SecurityHeaders.TagHelpers Nuget packages are added to the csproj file of the web application. The tag helpers are added to use the nonce from the CSP in the Razor Pages.

<ItemGroup>
  <PackageReference Include="NetEscapades.AspNetCore.SecurityHeaders" Version="0.16.0" />
  <PackageReference Include="NetEscapades.AspNetCore.SecurityHeaders.TagHelpers" Version="0.16.0" />
</ItemGroup>

What each header protects against is explained really well on the securityheaders.com results view of your test. You can click the different links for each validation, which take you to the documentation for each header. The links at the bottom of this post provide some excellent information about the what and why of the headers.

The security header definitions are added using the HeaderPolicyCollection class. I added this to a separate class to keep the Startup class, where the middleware is added, small. I passed a boolean parameter into the method which is used to add or remove the HSTS header. We might not want to add this in local development and block all non-HTTPS requests to localhost.

The policy defined in this demo is for Razor Page applications with as much blocked as possible. You should be able to re-use this in your projects.

The COOP (Cross Origin Opener Policy), COEP (Cross Origin Embedder Policy), and CORP (Cross Origin Resource Policy) headers are relatively new. You might need to update your existing application deployments with these. The links at the bottom of the post provide detailed information about these headers, and I would recommend reading them.

public static HeaderPolicyCollection GetHeaderPolicyCollection(bool isDev)
{
    var policy = new HeaderPolicyCollection()
        .AddFrameOptionsDeny()
        .AddXssProtectionBlock()
        .AddContentTypeOptionsNoSniff()
        .AddReferrerPolicyStrictOriginWhenCrossOrigin()
        .RemoveServerHeader()
        .AddCrossOriginOpenerPolicy(builder =>
        {
            builder.SameOrigin();
        })
        .AddCrossOriginEmbedderPolicy(builder =>
        {
            builder.RequireCorp();
        })
        .AddCrossOriginResourcePolicy(builder =>
        {
            builder.SameOrigin();
        })
        .AddContentSecurityPolicy(builder =>
        {
            builder.AddObjectSrc().None();
            builder.AddBlockAllMixedContent();
            builder.AddImgSrc().Self().From("data:");
            builder.AddFormAction().Self();
            builder.AddFontSrc().Self();
            builder.AddStyleSrc().Self(); // .UnsafeInline();
            builder.AddBaseUri().Self();
            builder.AddScriptSrc().UnsafeInline().WithNonce();
            builder.AddFrameAncestors().None();
        })
        .RemoveServerHeader()
        .AddPermissionsPolicy(builder =>
        {
            builder.AddAccelerometer().None();
            builder.AddAutoplay().None();
            builder.AddCamera().None();
            builder.AddEncryptedMedia().None();
            builder.AddFullscreen().All();
            builder.AddGeolocation().None();
            builder.AddGyroscope().None();
            builder.AddMagnetometer().None();
            builder.AddMicrophone().None();
            builder.AddMidi().None();
            builder.AddPayment().None();
            builder.AddPictureInPicture().None();
            builder.AddSyncXHR().None();
            builder.AddUsb().None();
        });

    if (!isDev)
    {
        // maxage = one year in seconds
        policy.AddStrictTransportSecurityMaxAgeIncludeSubDomains(maxAgeInSeconds: 60 * 60 * 24 * 365);
    }

    return policy;
}

In the Startup class, the UseSecurityHeaders method is used to apply the HTTP headers policy and add the middleware to the application. The env.IsDevelopment() check is used to decide whether or not to add the HSTS header. The default HSTS middleware from the ASP.NET Core templates was removed from the Configure method as it is not required.

public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    if (env.IsDevelopment())
    {
        app.UseDeveloperExceptionPage();
    }
    else
    {
        app.UseExceptionHandler("/Error");
    }

    app.UseSecurityHeaders(
        SecurityHeadersDefinitions
            .GetHeaderPolicyCollection(env.IsDevelopment()));

The server header can be removed in the program class if using Kestrel. If using IIS, you probably need to use the web.config to remove this.

public static IHostBuilder CreateHostBuilder(string[] args) =>
    Host.CreateDefaultBuilder(args)
        .ConfigureWebHostDefaults(webBuilder =>
        {
            webBuilder
                .ConfigureKestrel(options => options.AddServerHeader = false)
                .UseStartup<Startup>();
        });
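For the IIS case mentioned above, one common approach (not taken from the demo project, so treat it as a sketch) is the removeServerHeader request-filtering attribute, which is available in IIS 10 and later:

<configuration>
  <system.webServer>
    <security>
      <!-- strips the Server header from responses; requires IIS 10+ -->
      <requestFiltering removeServerHeader="true" />
    </security>
  </system.webServer>
</configuration>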

We want to apply the CSP nonce to all our scripts in the Razor Pages. We can add the NetEscapades.AspNetCore.SecurityHeaders namespace to the _ViewImports.cshtml file.

@using AspNetCoreRazor
@using NetEscapades.AspNetCore.SecurityHeaders
@namespace AspNetCoreRazor.Pages
@addTagHelper *, Microsoft.AspNetCore.Mvc.TagHelpers

In the Razor Page _Layout, the CSP nonce can be used. I had to get the nonce from the HttpContext and add this to all scripts. This way, the scripts will be loaded. If the nonce does not match or is not applied to the script, it will not be loaded due to the CSP definition. I’ll ping Andrew Lock to see if the tag helper could be used directly.

@{var nonce = Context.GetNonce();}
<script src="~/lib/jquery/dist/jquery.min.js" nonce="@nonce"></script>
<script src="~/lib/bootstrap/dist/js/bootstrap.bundle.min.js" nonce="@nonce"></script>
<script src="~/js/site.js" asp-append-version="true" nonce="@nonce"></script>
@await RenderSectionAsync("Scripts", required: false)
</body>
</html>

The HTTP security headers can be tested using https://securityheaders.com

I used ngrok to test this before I deployed the application. The securityheaders.com scan analyses the web application and returns a really neat summary of what headers you have and how good it finds them. Each header has a link to excellent documentation and blogs on Scott Helme‘s website https://scotthelme.co.uk

The CSP can be tested using https://csp-evaluator.withgoogle.com from Google. The CSP evaluator gives an excellent summary and also suggestions on how to improve the CSP.

The security of a Razor Page application can be much improved by adding the headers to the application. The NetEscapades.AspNetCore.SecurityHeaders Nuget package makes it incredibly easy to apply this. I will create a follow-up blog with a policy definition for a Blazor application and also for a Web API application.

Notes:

If the application is fully protected without any public views, the follow redirects checkbox on securityheaders.com needs to be disabled, as otherwise you only get the results for the identity provider used to authenticate.

If possible, I block all traffic which is not from my domain, including sub-domains. If implementing enterprise applications, I would always do this. If implementing public-facing applications with high traffic volumes, a need for extra-fast response times, or a need to reduce hosting costs, then CDNs would need to be used and allowed, and so on. Try to block everything first and open up as required, and maybe you can avoid some nasty surprises from all the JavaScript and CSS frameworks used.

Links:

https://securityheaders.com/

https://csp-evaluator.withgoogle.com/

Security by Default Chrome developers

A Simple Guide to COOP, COEP, CORP, and CORS

https://github.com/andrewlock/NetEscapades.AspNetCore.SecurityHeaders

https://github.com/dotnet/aspnetcore/issues/34428

https://w3c.github.io/webappsec-trusted-types/dist/spec/

https://web.dev/trusted-types/

https://developer.mozilla.org/en-US/docs/Web/HTTP/Cross-Origin_Resource_Policy_(CORP)

https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS

https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies

https://docs.google.com/document/d/1zDlfvfTJ_9e8Jdc8ehuV4zMEu9ySMCiTGMS9y0GU92k/edit

https://scotthelme.co.uk/coop-and-coep/

https://github.com/OWASP/ASVS

Sunday, 15. August 2021

Doc Searls Weblog

Beyond the Web

The Web is a haystack. This isn’t what Tim Berners-Lee had in mind when he invented the Web. Nor is it what Jerry Yang and David Filo had in mind when they invented Jerry and David’s Guide to the World Wide Web, which later became Yahoo. Jerry and David’s model for the Web was a library, and […]

The Web is a haystack.

This isn’t what Tim Berners-Lee had in mind when he invented the Web. Nor is it what Jerry Yang and David Filo had in mind when they invented Jerry and David’s Guide to the World Wide Web, which later became Yahoo. Jerry and David’s model for the Web was a library, and Yahoo was to be catalog for it. This made sense, given the prevailing conceptual frames for the Web at the time: real estate and publishing. Both are still with us today. We frame the Web as real estate when we speak of “sites” with “locations” in “domains” with “addresses” you can “visit” and “browse” through stuff called “files” and “pages,” which we “author,” “edit,” “post,” “publish,” “syndicate” and store in “folders” within a “directory.” Both frames suggest durability, if not permanence. Again, kind of like a library.

But once we added personal movement (“surf,” “browse”) and a vehicle for it (the browser), the Web became a World Wide Free-for-all. Literally. Anyone could publish, change and remove whatever they pleased, whenever they pleased. The same went for organizations of every kind, all over the world. And everyone with a browser could find their way to and through all of those spaces and places, and enjoy whatever “content” publishers chose to put there. Thus the Web grew into billions of sites, pages, images, databases, videos, and other stuff, with most of it changing constantly.

The result was a heaving heap of fuck-all.*

How big is it? According to WorldWebSize.com, Google currently indexes about 41 billion pages, and Bing about 9 billion. They also peaked together at about 68 billion pages in late 2019. The Web is surely larger than that, but that’s the practical limit because search engines are the practical way to find pieces of straw in that thing. Will the haystack be less of one when approached by other search engines, such as the new ad-less (subscription-funded) Neeva? Nope. Search engines do not give the Web a card catalog. They certify its nature as a haystack.

So that’s one practical limit. There are others, but they’re hard to see when the level of optionality on the Web is almost indescribably vast. But we can see a few limits by asking some questions:

Why do you always have to accept websites’ terms? And why do you have no record of your own of what you accepted, or when, or anything?
Why do you have no way to proffer your own terms, to which websites can agree?
Why did Do Not Track, which was never more than a polite request not to be tracked off a website, get no respect from 99.x% of the world’s websites? And how the hell did Do Not Track turn into the Tracking Preference Expression at the W3C, where the standard never did get fully baked?
Why, after Do Not Track failed, did hundreds of millions—or perhaps billions—of people start blocking ads, tracking or both, on the Web, amounting to the biggest boycott in world history? And then why did the advertising world, including nearly all advertisers, their agents, and their dependents in publishing, treat this as a problem rather than a clear and gigantic message from the marketplace?
Why are the choices presented to you by websites called your choices, when all those choices are provided by them? And why don’t you give them choices?
Why would Apple’s way of making you private on your phone be to “Ask App Not to Track,” rather than “Tell App Not to Track,” or “Prevent App From Tracking You”?
Why does the GDPR call people “data subjects” rather than people, or human beings, and then assign the roles “data controller” and “data processor” only to other parties?*
Why are nearly all the 200+ million results in a search for GDPR+compliance about how companies can obey the letter of the law while violating its spirit by continuing to track people through the giant loophole you see in every cookie notice?
Why does the CCPA give you the right to ask to have back personal data others have gathered about you on the Web, rather than forbid its collection in the first place? (Imagine a law that assumes that all farmers’ horses are gone from their barns, but gives those farmers a right to demand horses back from those who took them. It’s kinda like that.)
Why, 22 years after The Cluetrain Manifesto said, we are not seats or eyeballs or end users or consumers. we are human beings and our reach exceeds your grasp. deal with it. —is that statement still not true?
Why, 9 years after Harvard Business Review Press published The Intention Economy: When Customers Take Charge, has that not happened? (Really, what are you in charge of in the marketplace that isn’t inside companies’ silos and platforms?)

It’s easy to blame the cookie, which Lou Montulli invented in 1994 as a way for sites to remember their visitors by planting reminder files—cookies—in visitors’ browsers. Cookies also gave visitors a way to remember where they were when they last visited. For sites that require logins, cookies take care of that as well.

What matters, however, is not the cookie. It’s what makes the cookie necessary in the first place: the Web’s architecture. It’s called client-server, and is represented graphically like this:

This architecture was born in the era of centralized mainframes, which “users” accessed through client devices called “dumb terminals”:

On the Web, as it was in the old mainframe world, we clients—mere users—are as subordinate to servers as are cattle to ranchers or slaves to masters. In the client-server paradigm, our agency—our ability to act with effect in the world—is restricted to what servers allow or provide for us. Our choices are what they provide. We are independent only to the degree that we can also be clients to other servers. In this paradigm, a free market is “your choice of captor.”

Want privacy? You have to ask for it. And, if you go to the trouble of doing that—which you have to do separately with every site and service you encounter (each a mainframe of its own)—your client doesn’t keep a record of what you “agreed” to. The server does. Good luck finding whatever it is the server or its third parties remember about that agreement.

Want to control how your data (or data about you) gets processed by the servers of the world? Good luck with that too. Again, Europe’s GDPR says “natural persons” are just “data subjects,” while “data controllers” and “data processors” are roles reserved for servers.

Want a shopping cart of your own to take from site to site? My wife asked for that in 1995. It’s still barely thinkable in 2021. Want a dashboard for your life where you can gather all your expenses, investments, property records, health information, calendars, contacts, and other personal information? She asked for that too, and we still don’t have it, except to the degree that large server operators (e.g. Google, Apple, Microsoft) give us pieces of it, hosted in their clouds, and rigged to keep you captive to their systems.

That’s why we don’t yet have an Internet of Things (IoT), but rather an Apple of Things, a Google of Things, and an Amazon of Things.

Is it possible to do stuff on the Web that isn’t client-server? Perhaps some techies among us can provide examples, but practically speaking, here’s what matters: If it’s not thinkable by the owners of the servers we depend on, it doesn’t get made.

From our position at the bottom of the Web’s haystack, it’s hard to imagine there might be a world where it’s possible for us to have full agency: to not be just users of clients enslaved to as many servers as we deal with every day.

But that world exists. It’s called the Internet, and it can support a helluva lot more than the Web, with many ways to interact other than those possible in the client-server world alone.

Digital technology as we know it has only been around for a few decades, and the Internet for maybe half that time. Mobile computers that run apps and presume connectivity everywhere have only been with us for a decade or less. And all of those will be with us for many decades, centuries, or millennia to come. We are not going to stop living digital lives, any more than we are going to stop speaking, writing, or using mathematics. Digital technology and the Internet are granted wishes that won’t go back into the genie’s bottle.

So yes, the Web is wonderful, but not boundlessly so. It has limits. Thanks to the client-server architecture that prevails there, full personal agency is not a grace of life on the Web. For the thirty-plus years of the Web’s existence, and for its foreseeable future, we will never have more agency than its servers allow clients and users.

It’s time to think and build outside the haystack. Models for that do exist, and some have been around a long time.

Email, for example. While you can look at your email on the Web, or use a Web-based email service (such as Gmail), email itself is independent of those. My own searls.com email has been at servers in my home, on racks elsewhere, and in a hired cloud. I can move it anywhere I want. You can move yours as well. All the services I hire to host my email are substitutable. That’s just one way we can enjoy full agency on the Internet.

My own work outside the Web is currently happening at Customer Commons, on what we call the Byway. Go there and follow along as we work toward better answers to the questions above than you’ll get from inside the haystack.

*I originally had “heaving haystack of fuck-all” here, but some remember it as the more alliterative “heaving heap of fuck-all.” So I decided to swap them. If comments actually worked here, I’d ask for a vote. But feel free to write me instead, at first name at last name dot com.


Jon Udell

Postgres and JSON: Finding document hotspots (part 1)

One of the compelling aspects of modern SQL is the JSON support built into modern engines, including Postgres. The documentation is well done, but I need examples to motivate my understanding of where and how and why to use such a capability. The one I’ll use in this episode is something I call document hotspots. … Continue reading Postgres and JSON: Finding document hotspots (part 1)

One of the compelling aspects of modern SQL is the JSON support built into modern engines, including Postgres. The documentation is well done, but I need examples to motivate my understanding of where and how and why to use such a capability. The one I’ll use in this episode is something I call document hotspots.

Suppose a teacher has asked her students to annotate Arthur Miller’s The Crucible. How can she find the most heavily-annotated passages? They’re visible in the Hypothesis client, of course, but may be sparsely distributed. She can scroll through the 154-page PDF document to find the hotspots, but it will be helpful to see a report that brings them together. Let’s do that.

The Hypothesis system stores annotations using a blend of SQL and JSON datatypes. Consider this sample annotation:

When the Hypothesis client creates that annotation it sends a JSON payload to the server. Likewise, when the client subsequently requests the annotation in order to anchor it to the document, it receives a similar JSON payload.

{
  "id": "VLUhcP1-EeuHn5MbnGgJ0w",
  "created": "2021-08-15T04:07:39.343275+00:00",
  "updated": "2021-08-15T04:07:39.343275+00:00",
  "user": "acct:judell@hypothes.is",
  "uri": "https://ia800209.us.archive.org/17/items/TheCrucibleFullText/The%20Crucible%20full%20text.pdf",
  "text": "\"He is no one's favorite clergyman.\" :-)\n\nhttps://www.thoughtco.com/crucible-character-study-reverend-parris-2713521",
  "tags": [],
  "group": "__world__",
  "permissions": {
    "read": ["group:__world__"],
    "admin": ["acct:judell@hypothes.is"],
    "update": ["acct:judell@hypothes.is"],
    "delete": ["acct:judell@hypothes.is"]
  },
  "target": [
    {
      "source": "https://ia800209.us.archive.org/17/items/TheCrucibleFullText/The%20Crucible%20full%20text.pdf",
      "selector": [
        {
          "end": 44483,
          "type": "TextPositionSelector",
          "start": 44392
        },
        {
          "type": "TextQuoteSelector",
          "exact": " am not some preaching farmer with a book under my arm; I am a graduate of Harvard College.",
          "prefix": " sixty-six pound, Mr. Proctor! I",
          "suffix": " Giles: Aye, and well instructed"
        }
      ]
    }
  ],
  "document": {
    "title": ["The%20Crucible%20full%20text.pdf"]
  },
  "links": {
    "html": "https://hypothes.is/a/VLUhcP1-EeuHn5MbnGgJ0w",
    "incontext": "https://hyp.is/VLUhcP1-EeuHn5MbnGgJ0w/ia800209.us.archive.org/17/items/TheCrucibleFullText/The%20Crucible%20full%20text.pdf",
    "json": "https://hypothes.is/api/annotations/VLUhcP1-EeuHn5MbnGgJ0w"
  },
  "user_info": {
    "display_name": "Jon Udell"
  },
  "flagged": false,
  "hidden": false
}

The server mostly shreds this JSON into conventional SQL types. The tags array, for example, is hoisted out of the JSON into a SQL array-of-text. The expression to find its length is a conventional Postgres idiom: array_length(tags,1). Note the second parameter; array_length(tags) is an error, because Postgres arrays can be multidimensional. In this case there’s only one dimension but it’s still necessary to specify that.

A target_selectors column, though, is retained as JSON. These selectors define how an annotation anchors to a target selection in a document. Because selectors are used only by the Hypothesis client, which creates and consumes them in JSON format, there’s no reason to shred them into separate columns. In normal operation, selectors don’t need to be related to core tables. They can live in the database as opaque blobs of JSON.

For some analytic queries, though, it is necessary to peer into those blobs and relate their contents to core tables. There’s a parallel set of functions for working with JSON. For example, the target_selectors column corresponds to the target[0]['selector'] array in the JSON representation. The expression to find the length of that array is jsonb_array_length(target_selectors).

Here’s a similar expression that won’t work: json_array_length(target_selectors). Postgres complains that the function doesn’t exist.

ERROR: function json_array_length(jsonb) does not exist Hint: No function matches the given name and argument types.

In fact both functions, json_array_length and jsonb_array_length, exist. But Postgres knows the target_selectors column is of type jsonb, not json, which is the argument type json_array_length expects. What’s the difference between json and jsonb?

The json and jsonb data types accept almost identical sets of values as input. The major practical difference is one of efficiency. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.

https://www.postgresql.org/docs/12/datatype-json.html

Although I tend to use JSON to refer to data in a variety of contexts, the flavor of JSON in the Postgres queries, views, and functions I’ll discuss will always be jsonb. The input conversion overhead isn’t a problem for analytics work that happens in a data warehouse, and the indexing support is a tremendous enabler.

To illustrate some of the operators common to json and jsonb, here is a query that captures the target_selectors column from the sample annotation.

with example as (
  select
    id,
    target_selectors as selectors
  from annotation
  where id = '54b52170-fd7e-11eb-879f-931b9c6809d3'
)
select * from example;

Here are some other queries against example:

select selectors from example;

[{"end": 44483, "type": "TextPositionSelector", "start": 44392}, { ... } ]

The result is a human-readable representation, but the type of selectors is jsonb.

select pg_typeof(selectors) from example;

jsonb

The array-indexing operator, ->, can yield the zeroth element of the array.

select selectors->0 from example;

{"end": 44483, "type": "TextPositionSelector", "start": 44392}

The result is again a human-readable representation of a jsonb type.

select pg_typeof(selectors->0) from example;

jsonb

Another array-indexing operator, ->>, can also yield the zeroth element of the array, but now as type text.

select selectors->>0 from example;

{"end": 44483, "type": "TextPositionSelector", "start": 44392}

The result looks the same, but the type is different.

select pg_typeof(selectors->>0) from example;

text

The -> and ->> operators can also index objects by their keys. These examples work with the object that is the zeroth element of the array.

select selectors->0->'type' from example;

"TextPositionSelector"

select pg_typeof(selectors->0->'type') from example;

jsonb

select selectors->0->>'type' from example;

TextPositionSelector

select pg_typeof(selectors->0->>'type') from example;

text

The Hypothesis system stores the location of a target (i.e., the selection in a document to which an annotation refers) in the target_selectors column we’ve been exploring. It records a list of selectors. TextQuoteSelector represents the selection as the exact highlighted text bracketed by snippets of context. TextPositionSelector represents it as a pair of numbers that mark the beginning and end of the selection. When one range formed by that numeric pair is equal to another, it means two students have annotated the same selection. When a range contains another range, it means one student annotated the containing range, and another student made an overlapping annotation on the contained range. We can use these facts to surface hotspots where annotations overlap exactly or in nested fashion.
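To make that overlap test concrete, here is a small Python sketch (my own illustration, not Hypothesis code) that takes a handful of (start, end) pairs of the kind a TextPositionSelector provides and reports exact matches and nestings:

# A minimal sketch, not Hypothesis code: given (start, end) pairs extracted
# from TextPositionSelectors, report ranges that match exactly or nest.
from collections import Counter
from itertools import combinations

positions = [(44392, 44483), (44392, 44483), (44400, 44450), (100, 200)]

# Exact matches: two students annotated the same selection
exact = [rng for rng, count in Counter(positions).items() if count > 1]

# Containment: one annotation's range wholly contains another's
nested = [
    (a, b)
    for a, b in combinations(set(positions), 2)
    if (a[0] <= b[0] and b[1] <= a[1]) or (b[0] <= a[0] and a[1] <= b[1])
]

print("exact hotspots:", exact)
print("nested hotspots:", nested)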

To start, let’s have a function to extract the start/end range from an annotation. In a conventional programming language you might iterate through the selectors in the target_selectors array looking for the one with the type TextPositionSelector. That’s possible in pl/pgsql and pl/python, but Postgres affords a more SQL-oriented approach. Given a JSON array, the function jsonb_array_elements returns a table-like object with rows corresponding to array elements.

select jsonb_array_elements(selectors) from example;

{"end": 44483, "type": "TextPositionSelector", "start": 44392}
{"type": "TextQuoteSelector", "exact": " am not some preaching farmer with a book under my arm; I am a graduate of Harvard College.", "prefix": " sixty-six pound, Mr. Proctor! I", "suffix": " Giles: Aye, and well instructed"}

A function can convert the array to rows, select the row of interest, select the start and end values from the row, package the pair of numbers as an array, and return the array.

create function position_from_anno(_id uuid) returns numeric[] as $$
  declare
    range numeric[];
  begin
    with selectors as (
      select jsonb_array_elements(target_selectors) as selector
      from annotation
      where id = _id
    ),
    position as (
      select
        (selector->>'start')::numeric as startpos,
        (selector->>'end')::numeric as endpos
      from selectors
      where selector->>'type' = 'TextPositionSelector'
    )
    select array[p.startpos, p.endpos]
    from position p
    into range;
    return range;
  end;
$$ language plpgsql;

Using it for the sample annotation:

select position_from_anno('54b52170-fd7e-11eb-879f-931b9c6809d3');

 position_from_anno
--------------------
 {44392,44483}

I’ll show how to use position_from_anno to find document hotspots in a later episode. The goal here is just to introduce an example, and to illustrate a few of the JSON functions and operators.

What’s most interesting, I think, is this part.

where selector->>'type' = 'TextPositionSelector'

Although the TextPositionSelector appears as the first element of the selectors array, that isn’t guaranteed. In a conventional language you’d have to walk through the array looking for it. SQL affords a declarative way to find an element in a JSON array.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Saturday, 14. August 2021

Simon Willison

Quoting Chris Jones, original Internet Explorer team

There’s three ways to handle work assigned to you. If you say you’ll do it, do it. If you say you can’t, that’s ok. But if you sign up for work and drop the ball, the team fails. Learn to say no. — Chris Jones, original Internet Explorer team

There’s three ways to handle work assigned to you. If you say you’ll do it, do it. If you say you can’t, that’s ok. But if you sign up for work and drop the ball, the team fails. Learn to say no.

Chris Jones, original Internet Explorer team


Datasette on Codespaces, sqlite-utils API reference documentation and other weeknotes

This week I broke my streak of not sending out the Datasette newsletter, figured out how to use Sphinx for Python class documentation, worked out how to run Datasette on GitHub Codespaces, implemented Datasette column metadata and got tantalizingly close to a solution for an elusive Datasette feature. API reference documentation for sqlite-utils using Sphinx I've never been a big fan of Javado

This week I broke my streak of not sending out the Datasette newsletter, figured out how to use Sphinx for Python class documentation, worked out how to run Datasette on GitHub Codespaces, implemented Datasette column metadata and got tantalizingly close to a solution for an elusive Datasette feature.

API reference documentation for sqlite-utils using Sphinx

I've never been a big fan of Javadoc-style API documentation: I usually find that documentation structured around classes and methods fails to show me how to actually use those classes to solve real-world problems. I've tended to avoid it for my own projects.

My sqlite-utils Python library has a ton of functionality, but it mainly boils down to two classes: Database and Table. Since it already has pretty comprehensive narrative documentation explaining the different problems it can solve, I decided to try experimenting with the Sphinx autodoc module to produce some classic API reference documentation for it:

Since autodoc works from docstrings, this was also a great excuse to add more comprehensive docstrings and type hints to the library. This helps tools like Jupyter notebooks and VS Code display more useful inline help.
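For illustration, here is what an autodoc-friendly docstring with type hints can look like; this is a hypothetical sketch, and the class and method names here are made up rather than taken from sqlite-utils:

# An illustrative sketch of an autodoc-friendly docstring with type hints;
# the class and signature here are made up, not sqlite-utils' actual API.
from typing import Iterable, Optional


class Table:
    def insert_rows(
        self, records: Iterable[dict], pk: Optional[str] = None
    ) -> "Table":
        """
        Insert multiple records into this table.

        :param records: an iterable of dictionaries, one per row
        :param pk: column to use as the primary key, if any
        :return: the table, so that calls can be chained
        """
        raise NotImplementedError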

This proved to be time well spent! Here's what sqlite-utils looks like in VS Code now:

Running mypy against the type hints also helped me identify and fix a couple of obscure edge-case bugs in the existing methods, detailed in the 3.15.1 release notes. It's taken me a few years but I'm finally starting to come round to Python's optional typing as being worth the additional effort!

Figuring out how to use autodoc in Sphinx, and then how to get the documentation to build correctly on Read The Docs took some effort. I wrote up what I learned in this TIL.

Datasette on GitHub Codespaces

GitHub released their new Codespaces online development environments to general availability this week and I'm really excited about it. I ran a team at Eventbrite for a while responsible for development environment tooling and it really was shocking how much time and money was lost to broken local development environments, even with a significant amount of engineering effort applied to the problem.

Codespaces promises a fresh, working development environment on-demand any time you need it. That's a very exciting premise! Their detailed write-up of how they convinced GitHub's own internal engineers to move to it is full of intriguing details - getting an existing application working with it is no small feat, but the pay-off looks very promising indeed.

So... I decided to try and get Datasette running on it. It works really well!

You can run Datasette in any Codespace environment using the following steps:

1. Open the terminal. Three-bar-menu-icon, View, Terminal does the trick.
2. In the terminal run pip install datasette datasette-x-forwarded-host (more on this in a moment).
3. Run datasette - Codespaces will automatically set up port forwarding and give you a link to "Open in Browser" - click the link and you're done!

You can pip install sqlite-utils and then use sqlite-utils insert to create SQLite databases to use with Datasette.

There was one catch: the first time I ran Datasette, clicking on any of the internal links within the web application took me to http://localhost/ pages that broke with a 404.

It turns out the Codespaces proxy sends a host: localhost header - which Datasette then uses to incorrectly construct internal URLs.

So I wrote a tiny ASGI plugin, datasette-x-forwarded-host, which takes the incoming X-Forwarded-Host provided by Codespaces and uses that as the Host header within Datasette itself. After that everything worked fine.
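The idea behind that plugin can be sketched as a few lines of ASGI middleware. This is a hedged illustration rather than the plugin's actual source, and the wrapper name is my own:

# A sketch of the idea behind datasette-x-forwarded-host, not its actual source.
# It rewrites the incoming host header to the value of x-forwarded-host.
def x_forwarded_host(app):
    async def wrapper(scope, receive, send):
        if scope["type"] == "http":
            headers = dict(scope.get("headers") or [])
            forwarded = headers.get(b"x-forwarded-host")
            if forwarded:
                headers[b"host"] = forwarded
                scope = dict(scope, headers=list(headers.items()))
        await app(scope, receive, send)

    return wrapper

In Datasette, a wrapper along these lines can be registered through the asgi_wrapper plugin hook.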

sqlite-utils insert --flatten

Early this week I finally figured out Cloud Run logging. It's actually really good! In doing so, I worked out a convoluted recipe for tailing the JSON logs locally and piping them into a SQLite database so that I could analyze them with Datasette.

Part of the reason it was convoluted is that Cloud Run logs feature nested JSON, but sqlite-utils insert only works against an array of flat JSON objects. I had to use this jq monstrosity to flatten the nested JSON into key/value pairs.

Since I've had to solve this problem a few times now I decided to improve sqlite-utils to have it do the work instead. You can now use the new --flatten option like so:

sqlite-utils insert logs.db logs log.json --flatten

To create a schema that flattens nested objects into a topkey_nextkey structure like so:

CREATE TABLE [logs] (
   [httpRequest_latency] TEXT,
   [httpRequest_requestMethod] TEXT,
   [httpRequest_requestSize] TEXT,
   [httpRequest_status] INTEGER,
   [insertId] TEXT,
   [labels_service] TEXT
);

Full documentation for --flatten.
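To illustrate what the flattening does to a nested record, here is a small sketch of the key-naming behaviour; it is my own illustration, not sqlite-utils' implementation:

# A sketch of the flattening behaviour, not sqlite-utils' implementation
def flatten(record, prefix=""):
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat

print(flatten({"httpRequest": {"status": 200, "latency": "0.1s"}, "insertId": "abc"}))
# {'httpRequest_status': 200, 'httpRequest_latency': '0.1s', 'insertId': 'abc'}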

Datasette column metadata

I've been wanting to add this for a while: Datasette's main branch now includes an implementation of column descriptions metadata for Datasette tables. This is best illustrated by a screenshot (of this live demo):

You can add the following to metadata.yml (or .json) to specify descriptions for the columns of a given table:

databases:
  fixtures:
    tables:
      roadside_attractions:
        columns:
          name: The name of the attraction
          address: The street address for the attraction

Column descriptions will be shown in a <dl> at the top of the page, and will also be added to the menu that appears when you click on the cog icon at the top of a column.

Getting closer to query column metadata, too

Datasette lets you execute arbitrary SQL queries, like this one:

select
  roadside_attractions.name,
  roadside_attractions.address,
  attraction_characteristic.name
from
  roadside_attraction_characteristics
  join roadside_attractions
    on roadside_attractions.pk = roadside_attraction_characteristics.attraction_id
  join attraction_characteristic
    on attraction_characteristic.pk = roadside_attraction_characteristics.characteristic_id

You can try that here. It returns the following:

name                                  address                                             name
The Mystery Spot                      465 Mystery Spot Road, Santa Cruz, CA 95065         Paranormal
Winchester Mystery House              525 South Winchester Boulevard, San Jose, CA 95128  Paranormal
Bigfoot Discovery Museum              5497 Highway 9, Felton, CA 95018                    Paranormal
Burlingame Museum of PEZ Memorabilia  214 California Drive, Burlingame, CA 94010          Museum
Bigfoot Discovery Museum              5497 Highway 9, Felton, CA 95018                    Museum

The columns it returns have names... but I’ve long wanted to do more with these results. If I could derive which source column each of those output columns comes from, there are a bunch of interesting things I could do, most notably:

If the output column is a known foreign key relationship, I could turn it into a hyperlink (as seen on this table page)
If the original table column has the new column metadata, I could display that as additional documentation

The challenge is: given an arbitrary SQL query, how can I figure out what the resulting columns are going to be and how to tie those back to the original tables?

Thanks to a hint from the SQLite forum I'm getting tantalizingly close to a solution.

The trick is to horribly abuse SQLite's explain output. Here's what it looks like for the example query above:

addr  opcode       p1  p2  p3  p4  p5  comment
0     Init          0  15   0       0
1     OpenRead      0  47   0   2   0
2     OpenRead      1  45   0   3   0
3     OpenRead      2  46   0   2   0
4     Rewind        0  14   0       0
5     Column        0   0   1       0
6     SeekRowid     1  13   1       0
7     Column        0   1   2       0
8     SeekRowid     2  13   2       0
9     Column        1   1   3       0
10    Column        1   2   4       0
11    Column        2   1   5       0
12    ResultRow     3   3   0       0
13    Next          0   5   0       1
14    Halt          0   0   0       0
15    Transaction   0   0  35   0   1
16    Goto          0   1   0       0

The magic is on line 12: ResultRow 3 3 means "return a result that spans three columns, starting at register 3" - so that's registers 3, 4 and 5. Those three registers are populated by the Column operations on lines 9, 10 and 11 (the register they write into is in the p3 column). Each Column operation specifies the table (as p1) and the column index within that table (p2). And those table references map back to the OpenRead lines at the start, where p1 is that table register (referred to by Column) and p2 is the root page of the table within the schema.

Running select rootpage, name from sqlite_master where rootpage in (45, 46, 47) produces the following:

rootpage  name
45        roadside_attractions
46        attraction_characteristic
47        roadside_attraction_characteristics

Tie all of this together, and it may be possible to use explain to derive the original tables and columns for each of the outputs of an arbitrary query!
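Here is a rough Python sketch of that idea using the standard library sqlite3 module; the table is a stand-in I made up, and the code only handles the simple case described above, where Column opcodes feed a single ResultRow:

# A rough sketch of the explain trick: map each output column of a query back
# to a (table name, column index) pair. It only handles the simple case
# described above, where Column opcodes feed a single ResultRow.
import sqlite3

def result_columns(conn, sql):
    rows = conn.execute("explain " + sql).fetchall()
    # Each row is (addr, opcode, p1, p2, p3, p4, p5, comment)
    table_roots = {p1: p2 for _, op, p1, p2, *_ in rows if op == "OpenRead"}
    root_names = dict(
        conn.execute("select rootpage, name from sqlite_master where type = 'table'")
    )
    register_to_column = {
        p3: (root_names.get(table_roots.get(p1)), p2)
        for _, op, p1, p2, p3, *_ in rows
        if op == "Column"
    }
    start, count = next((p1, p2) for _, op, p1, p2, *_ in rows if op == "ResultRow")
    return [register_to_column.get(register) for register in range(start, start + count)]

conn = sqlite3.connect(":memory:")
conn.execute("create table roadside_attractions (pk integer primary key, name text, address text)")
print(result_columns(conn, "select name, address from roadside_attractions"))
# e.g. [('roadside_attractions', 1), ('roadside_attractions', 2)]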

I was almost ready to declare victory, until I tried running it against a query with an order by column at the end... and the results no longer matched up.

You can follow my ongoing investigation here - the short version is that I think I'm going to have to learn to decode a whole bunch more opcodes before I can get this to work.

This is also a very risky way of attacking this problem. The SQLite documentation for the bytecode engine includes the following warning:

This document describes SQLite internals. The information provided here is not needed for routine application development using SQLite. This document is intended for people who want to delve more deeply into the internal operation of SQLite.

The bytecode engine is not an API of SQLite. Details about the bytecode engine change from one release of SQLite to the next. Applications that use SQLite should not depend on any of the details found in this document.

So it's pretty clear that this is a highly unsupported way of working with SQLite!

I'm still tempted to try it though. This feature is very much a nice-to-have: if it breaks and the additional column context stops displaying it's not a critical bug - and hopefully I'll be able to ship a Datasette update that takes into account those breaking SQLite changes relatively shortly afterwards.

If I can find another, more supported way to solve this I'll jump on it!

In the meantime, I did use this technique to solve a simpler problem. Datasette extracts :named parameters from arbitrary SQL queries and turns them into form fields - but since it uses a simple regular expression for this it could be confused by things like a literal 00:04:05 time string contained in a SQL query.

The explain output for that query includes the following:

addr  opcode    p1  p2  p3  p4     p5  comment
...   ...       ..  ..  ..  ...    ..  ...
27    Variable   1  12   0  :text   0

So I wrote some code which uses explain to extract just the p4 operands from Variable columns and treats those as the extracted parameters! This feels a lot safer than the more complex ResultRow/Column logic - and it also falls back to the regular expression if it runs into any SQL errors. More in the issue.
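A sketch of that simpler trick, again using the standard library sqlite3 module; the function and variable names are illustrative rather than Datasette's internals:

# A sketch of the simpler trick: a loose regex proposes candidate :params,
# and the Variable opcodes in explain output confirm which ones SQLite
# really treats as parameters. Names here are illustrative, not Datasette's.
import re
import sqlite3

def named_parameters(conn, sql):
    candidates = re.findall(r":([a-zA-Z0-9_]+)", sql)
    try:
        rows = conn.execute("explain " + sql, {name: None for name in candidates})
    except sqlite3.DatabaseError:
        return candidates  # fall back to the regex guess
    return [p4.lstrip(":") for _, op, _, _, _, p4, *_ in rows if op == "Variable" and p4]

conn = sqlite3.connect(":memory:")
print(named_parameters(conn, "select :name, '00:04:05'"))
# ['name'] - the 00:04:05 literal is not mistaken for parameters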

TIL this week Tailing Google Cloud Run request logs and importing them into SQLite Find local variables in the traceback for an exception Adding Sphinx autodoc to a project, and configuring Read The Docs to build it Releases this week datasette-x-forwarded-host: 0.1 - 2021-08-12
Treat the X-Forwarded-Host header as the Host header sqlite-utils: 3.15.1 - (84 releases total) - 2021-08-10
Python CLI utility and library for manipulating SQLite databases datasette-query-links: 0.1.2 - (3 releases total) - 2021-08-09
Turn SELECT queries returned by a query into links to execute them datasette: 0.59a1 - (96 releases total) - 2021-08-09
An open source multi-tool for exploring and publishing data datasette-pyinstrument: 0.1 - 2021-08-08
Use pyinstrument to analyze Datasette page performance

Friday, 13. August 2021

Simon Willison

Re-assessing the automatic charset decoding policy in HTTPX

Re-assessing the automatic charset decoding policy in HTTPX Tom Christie ran an analysis of the top 1,000 most accessed websites (according to an older extract from Google's Ad Planner service) and found that a full 5% of them both omitted a charset parameter and failed to decode as UTF-8. As a result, HTTPX will be depending on the charset-normalizer Python library to handle those cases.

Re-assessing the automatic charset decoding policy in HTTPX

Tom Christie ran an analysis of the top 1,000 most accessed websites (according to an older extract from Google's Ad Planner service) and found that a full 5% of them both omitted a charset parameter and failed to decode as UTF-8. As a result, HTTPX will be depending on the charset-normalizer Python library to handle those cases.

Via @starletdreaming


Moxy Tongue

Self-Sovereign Rights

Self-Sovereign Rights, And The Function of USER:ID In Rights & Derivative Laws For Mankind. People never required a definition of the word "People" until recently. A Governed Civil Society constituted "of, by, for" people, seems obvious in its implications for design, interpretation, and amendment procedures. But apparently, this is not the case. The data structure of participation in a "Gove

Self-Sovereign Rights, And The Function of USER:ID In Rights & Derivative Laws For Mankind.

People never required a definition of the word "People" until recently. A Governed Civil Society constituted "of, by, for" people, seems obvious in its implications for design, interpretation, and amendment procedures. But apparently, this is not the case. The data structure of participation in a "Governed Civil Society" is very dependent on some baseline, foundational accuracies about the source origin of "Sovereign Rights".

This is not optional anymore. People, being organized as "People" and written into databases, contracts, and Governed State Trusts, represented as both paper & digital artifacts, have become abstractions representing the original source authority required for the accurate processing of Sovereign authority. As these abstractions, written into documents via non-computational languages with historic precedent, become the priority mission for resolving "Civil Rights" in any Governed system designed "of, by, for" people, or "people", it is critical that the structure intended and inferred by language, become operational in the data structure that yields precedent in the real world where all Sovereign Rights matter. Rights are for people after all, not birth certificates.

Blood, mud, local community.. this is the home of Sovereign Rights. There is no alternative. It may seem as though digital technology is creating a new "metaverse" where new laws of nature are possible. Everything, as it will become obvious to understand in time, is made of data, and even emits data endlessly unto non-existence. People, you are the context. You are the reason data is being discovered, valued, harvested, manipulated, controlled, inspected, regulated, judicially reviewed, and entrepreneured in all the myriad ways possible to do, commercial, civic, social, health, education... the list is endless in time.

One thing never changes. One thing stays consistent. One thing matters for the sake of Humanity.

People own root authority, locally. Systems derived "of, by, for" people have a design constraint, and requirement for operational integrity. Words come and go, literature makes perfect what life makes real. You, an Individual person, are the only "people" that will ever actually exist. "We The People" may be a literary abstraction of reality, and it may be translated by a legal abstraction in time, but at the design and intent root level of it's meaning, "We The People" refers only to local people, in the real world, Individuals all.

That is the structural requirement. Our masses, our ability to coordinate as "We", or be perceived as "They", to form teams, to collaborate and build more perfect unions.. all of us, as Individuals, living among one another.. this is the only accurate structure of Sovereign Rights that are accurate to the design intent within a civil society derived "of, by, for" people, Individuals all.

Equality is a starting point of civil society. An accurate data structure yields accurate results in the hands of people. Individuals choose their path to happiness. Individuals choose to express their liberty. Individuals live life. Individuals are the only people alive on this planet, and at the data structure start of a Governed Civil System, all people are equal. Thats accurate law, thats accurate data structure.

Systems do not possess people. Systems are derived of, by for people. This is accurate fact. All formats of artificial intelligence known to mankind are derived from people, for people. Any data structure routine that runs counter to this accurate fact violates basic structural rights that people possess, as Individuals. Governments exist to enforce those Rights of, by for people, Individuals all.

In the year 2020++, there is great structural confusion about these words. One day, people will read these words and see them clearly. Clearly, they carry meaning. Clearly, they carry intent. All Human Rights are self-Sovereign in origin of authority, and this ultimate human authority is always locally expressed by Individuals. All limited liability structures derived by the consent of the people Governing Civil Society, and operationally participating in the accurate design of "Civil Society', are responsible for the return of value to the people, of, by, for and from which they emit. Any legal abstraction that removes the Sovereign Rights of people from "people" commits a crime of fraud on the people who civil society is "of, by, for", and their treatment under the derived Law of their Sovereign consent must transact "accurate authority" for people.

Currently, upon entering a unique relationship with Governed civil society via a USER:ID, discoverable within any IAM ID Management system in the world, and in use by Government processes to transact the Rights and permissions of citizens in their local purview, the data structure of your participation becomes "legal" and inaccurate. Governments, deriving their consent from the Governed, may design any data structure they wish in time. All Governments are to be held accountable for the accurate abstraction of human Rights and laws derived of, by for the people who produce it's authority of consent. People, Individuals All, is the only living reality.

Self-Sovereign Rights represent this accurate data structure of Governed Civil Society, whereby USER:ID is utilized to confer Rights Administration capabilities upon a populace of contextual people operating with local authority as Individual citizens, and non-citizen people with innate self-Sovereign Human Rights. A Nation such as America could not come into existence by any other data structure, and history itself serves as the ultimate precedent for establishing an operational context used in the very interpretation of Constitutional literature.

Working Groups, operating within non-governmental organizations or non-profit public benefit charities, exist as a layered dependency of such Governed systems. Their work is derived by the consent of the people, Individuals all, and as such, the accurate translation of their efforts is directly accountable to the accurate data structure outcomes of their local jurisdictional translations. America is not Europe, California is not New York. People, Individuals All, Always Local, Always Root Authority.

Data structure in a computational society Governed accurately "of, by, for" people requires local consent in all transactions and derived uses. The concept of an "open society" and a "secured society" can only co-exist if people are accurately represented. Otherwise, exploitation is the default condition of participation. Evidence can be found in the identity theft data, where babies with a social security number occupy a precarious artifact condition in a society construed as "Governed" and "Civil" as the #1 risk vector and systemic victim. The produced results of the current data structure of people in THE UNITED STATES OF AMERICA is a travesty currently, and does not reflect the accurate data structure of the American people. An error of omission sits at the base of the stack affecting how Rights and Laws are translated, and conditioned as necessary.

People, Individuals All, operationally accurate as a data structure derived of, by for self-Sovereign human authority, expressed locally by people. This is the goal of work groups; consent receipts are an example of such work. Their accurate translation of contextual jurisdictional meaning will only be possible with a starting foundation of accurate self-Sovereign Rights, expressed by people locally. Systems must conform to reality. 

People pursue lives of happiness, liberty, life... for people, in the real world, not as "people" abstractions required to conform to system designs that are computationally inaccurate. Data structures yield operational results. People own root authority in civil systems. It is not optional. Inaccuracy is unconstitutional, literally and operationally.

The Supreme Court is required to translate the meaning of the Law of THE UNITED STATES OF AMERICA, but has no self-comprehension of the inaccuracies present within the data structure of its jurisdictional authority. Systems don't think, systems don't innovate, systems don't iterate, amend, or get human intent correct in their design. People do, Individuals do, teams of Individuals do. Local people are the Self-Sovereign Rights holders and creators of Governed Civil Society, and in the absence of their accurate participation using an accurate data structure for expressing their human and civil Rights under Laws derived "of, by, for" their consent... well, welcome to 2020++ .. enjoy the loop.

An accurate operational data structure is a pre-requirement for the accurate translation of Sovereign Rights, Laws, and permissions administered in a civil system. All corporations and corporate systems are derived via the Governed process of Rights, Laws and permissions within a designed civil system, and the accurate translations of Laws affects the limitation of liability and expression of benefits for people within any and all local jurisdictions.

In short, until the data structure of the world is accurately represented in the Governed Civil Systems where USER:ID confers operational meaning to people, any civil system claiming to be derived "of, by, for" people is currently non-compliant, and unconstitutional in operational design. Self-Sovereign Rights are the origin of Governed Civil Systems derived "of, by, for" people, Individuals all. 

Data structure yields operational results.. people own root.

I repeat, People, Individuals All, own root authority in any/all "Civil Systems" of Rights/Law administration.






Jon Udell

pl/python metaprogramming

In episode 2 I mentioned three aspects of pl/python that are reasons to use it instead of pl/pgsql: access to Python modules, metaprogramming, and introspection. Although this episode focuses on metaprogramming — by which I mean using Python to dynamically compose and run SQL queries — my favorite example combines all three aspects. The context … Continue reading pl/python metaprogramming

In episode 2 I mentioned three aspects of pl/python that are reasons to use it instead of pl/pgsql: access to Python modules, metaprogramming, and introspection.

Although this episode focuses on metaprogramming — by which I mean using Python to dynamically compose and run SQL queries — my favorite example combines all three aspects.

The context for the example is an analytics dashboard with a dozen panels, each driven by a pl/python function that’s parameterized by the id of a school or a course. So, for example, the Questions and Answers panel on the course dashboard is driven by a function, questions_and_answers_for_group(group_id), which wraps a SQL query that:

– calls another pl/python function, questions_for_group(group_id), to find notes in the group that contain question marks

– finds the replies to those notes

– builds a table that summarizes the question/answer pairs

Here’s the SQL wrapped by the questions_and_answers_for_group(group_id) function.

sql = f"""
  with questions as (
    select *
    from questions_for_group('{_group_id}')
  ),
  ids_and_refs as (
    select
      id,
      unnest ("references") as ref
    from annotation
    where groupid = '{_group_id}'
  ),
  combined as (
    select
      q.*,
      array_agg(ir.id) as reply_ids
    from ids_and_refs ir
    inner join questions q on q.id = ir.ref
    group by q.id, q.url, q.title, q.questioner, q.question, q.quote
  ),
  unnested as (
    select
      c.url,
      c.title,
      c.quote,
      c.questioner,
      c.question,
      unnest(reply_ids) as reply_id
    from combined c
  )
  select distinct
    course_for_group('{_group_id}') as course,
    teacher_for_group('{_group_id}') as teacher,
    clean_url(u.url) as url,
    u.title,
    u.quote,
    u.questioner,
    (regexp_matches(u.question, '.+\?'))[1] as question,
    display_name_from_anno(u.reply_id) as answerer,
    text_from_anno(u.reply_id) as answer,
    app_host() || '/course/render_questions_and_answers/{_group_id}' as viewer
  from unnested u
  order by course, teacher, url, title, questioner, question
"""

This isn’t yet what I mean by pl/python metaprogramming. You could as easily wrap this SQL code in a pl/pgsql function. More easily, in fact, because in pl/pgsql you could just write _group_id instead of '{_group_id}'.

To get where we’re going, let’s zoom out and look at the whole questions_and_answers_for_group(group_id) function.

create function questions_and_answers_for_group(_group_id text)
  returns setof question_and_answer_for_group as $$
from plpython_helpers import (
    exists_group_view,
    get_caller_name,
    memoize_view_name
)
base_view_name = get_caller_name()
view_name = f'{base_view_name}_{_group_id}'
if exists_group_view(plpy, view_name):
    sql = f"""
        select * from {view_name}
    """
else:
    sql = f"""
        <SEE ABOVE>
    """
    memoize_view_name(sql, view_name)
    sql = f"""
        select * from {view_name}
    """
return plpy.execute(sql)
$$ language plpython3u;

This still isn’t what I mean by metaprogramming. It introduces introspection — this is a pl/python function that discovers its own name and works with an eponymous materialized view — but that’s for a later episode.

It also introduces the use of Python modules by pl/python functions. A key thing to note here is that this is an example of what I call a memoizing function. When called it looks for a materialized view that captures the results of the SQL query shown above. If yes, it only needs to use a simple SELECT to return the cached result. If no, it calls memoize_view_name to run the underlying query and cache it in a materialized view that the next call to questions_and_answers_for_group(group_id) will use in a simple SELECT. Note that memoize_view_name is a special function that isn’t defined in Postgres using CREATE FUNCTION foo() like a normal pl/python function. Instead it’s defined using def foo() in a Python module called plpython_helpers. The functions there can do things — like create materialized views — that pl/python functions can’t. More about that in another episode.
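As a purely hypothetical sketch of what such a helper might do, written here with psycopg2 and a made-up connection string rather than the real plpython_helpers module:

# A hypothetical sketch of a memoizing helper, not the real plpython_helpers
# module (that module runs outside pl/python and is covered in a later episode).
import psycopg2

def memoize_view_name(sql, view_name, dsn="dbname=analytics"):
    """Cache the results of sql in a materialized view named view_name."""
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(
                f"create materialized view if not exists {view_name} as {sql}"
            )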

The focus in this episode is metaprogramming, which is used in this example to roll up the results of multiple calls to questions_and_answers_for_group(group_id). That happens when the group_id refers to a course that has sections. If you’re teaching the course and you’ve got students in a dozen sections, you don’t want to look at a dozen dashboards; you’d much rather see everything on the primary course dashboard.

Here’s the function that does that consolidation.

create function consolidated_questions_and_answers_for_group(_group_id text)
  returns setof question_and_answer_for_group as $$
from plpython_helpers import (
    get_caller_name,
    sql_for_consolidated_and_memoized_function_for_group
)
base_view_name = get_caller_name()
sql = sql_for_consolidated_and_memoized_function_for_group(
    plpy, base_view_name, 'questions_and_answers_for_group', _group_id)
sql += ' order by course, url, title, questioner, answerer'
return plpy.execute(sql)
$$ language plpython3u;

This pl/python function not only memoizes its results as above, it also consolidates results for all sections of a course. The memoization happens here.

def sql_for_consolidated_and_memoized_function_for_group(
    plpy, base_view_name, function, group_id
):
    view_name = f'{base_view_name}_{group_id}'
    sql = f"""
        select exists_view('{view_name}') as exists
    """
    exists = row_zero_value_for_colname(plpy, sql, 'exists')
    if exists:
        sql = f"""
            select * from {view_name}
        """
    else:
        sql = consolidator_for_group_as_sql(plpy, group_id, function)
        memoize_view_name(sql, view_name)
        sql = f"""
            select * from {view_name}
        """
    return sql

The consolidation happens here, and this is finally what I think of as classical metaprogramming: using Python to compose SQL.

def consolidator_for_group_as_sql(plpy, _group_id, _function):
    sql = f"select type_for_group('{_group_id}') as type"
    type = row_zero_value_for_colname(plpy, sql, 'type')
    if type == 'section_group' or type == 'none':
        sql = f"select * from {_function}('{_group_id}')"
    if type == 'course_group' or type == 'course':
        sql = f"select has_sections('{_group_id}')"
        has_sections = row_zero_value_for_colname(plpy, sql, 'has_sections')
        if has_sections:
            sql = f"""
                select array_agg(group_id) as group_ids
                from sections_for_course('{_group_id}')
            """
            group_ids = row_zero_value_for_colname(plpy, sql, 'group_ids')
            selects = [f"select * from {_function}('{_group_id}') "]
            for group_id in group_ids:
                selects.append(f"select * from {_function}('{group_id}')")
            sql = ' union '.join(selects)
        else:
            sql = f"select * from {_function}('{_group_id}')"
    return sql

If the inbound _group_id is p1mqaeep, the inbound _function is questions_and_answers_for_group, and the group has no sections, the SQL will just be select * from questions_and_answers_for_group('p1mqaeep').

If the group does have sections, then the SQL will instead look like this:

select * from questions_and_answers_for_group('p1mqaeep')
union
select * from questions_and_answers_for_group('x7fe93ba')
union
select * from questions_and_answers_for_group('qz9a4b3d')

This is a very long-winded way of saying that pl/python is an effective way to compose and run arbitrarily complex SQL code. In theory you could do the same thing using pl/pgsql; in practice it would be insane to try. I’ve entangled the example with other aspects — modules, introspection — because that’s the real situation. pl/python’s maximal power emerges from the interplay of all three aspects. That said, it’s a fantastic way to extend Postgres with user-defined functions that compose and run SQL code.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Wednesday, 11. August 2021

Simon Willison

GitHub’s Engineering Team has moved to Codespaces

GitHub’s Engineering Team has moved to Codespaces My absolute dream development environment is one where I can spin up a new, working development environment in seconds - to try something new on a branch, or because I broke something and don't want to spend time figuring out how to fix it. This article from GitHub explains how they got there: from a half-day setup to a 45 minute bootstrap in a c

GitHub’s Engineering Team has moved to Codespaces

My absolute dream development environment is one where I can spin up a new, working development environment in seconds - to try something new on a branch, or because I broke something and don't want to spend time figuring out how to fix it. This article from GitHub explains how they got there: from a half-day setup to a 45 minute bootstrap in a codespace, then to five minutes through shallow cloning and a nightly pre-built Docker image, and finally to 10 seconds by setting up "pools of codespaces, fully cloned and bootstrapped, waiting to be connected with a developer who wants to get to work".

Tuesday, 10. August 2021

MyDigitalFootprint

What happens when tension is too low to improve performance or conflict too much to bear?

Right now, in your current situation, you are asked to or informed to, undertake or perform certain actions.  When these instructions and activities align with your natural instinct it is easy.  When there is a lack of alignment we know that there are tensions, contradictions, conflicts and compromises created.  Often these creep up on us, where one small thing leads to another unt
Right now, in your current situation, you are asked to or informed to, undertake or perform certain actions.  When these instructions and activities align with your natural instinct it is easy.  When there is a lack of alignment we know that there are tensions, contradictions, conflicts and compromises created.  Often these creep up on us, where one small thing leads to another until the snowball is in full avalanche mode.  

When you feel that comfort blanket of alignment, you should use the peak paradox framework to look for paradoxes, because you are aligned.  It does not mean everyone is aligned or you are doing the right thing, but it feels easy.  The sense of natural alignment is a confirmation bias that is hard to fight, but it is time to focus on where paradoxes can exist.

When that comfort blanket is missing and we are in the open white waters of turmoil, the Peak Paradox framework is designed to unpack where, and from whom and from what decisions, you feel destructive tensions, unresolvable conflicts and compromises.


The framework will help us make (better) choices, decisions and judgement to resolve those stresses, either by moving/ “walking” to a different place or through communication and understanding.  It provides a passage where you can find a place to live with tensions we understand, conflicts we can manage and compromises we are prepared to live with.  The framework will not eradicate anything, it helps us explain and unpack; as we all face the same paradoxes with our own perceptions. 

The Importance of Tension

Creative tension, along with pressure created from constraints and “healthy” stress as we aim for our Northstar, enables us to perform (better).  Indeed, creative tension in the boardroom is healthy and leads to wider considerations that improve decision-making, leading to better outcomes.  A lack of creative tension, say as a result of too much power, compliance or consensus, creates conflicts, unhelpful tension, stress and individual compromises, which all lead to poor outcomes.

Finding the right balance is in part about leadership but also governance, oversight and the role of a chairperson.  As explored in previous posts, decisions at peak paradox are never good, because everyone has compromised and there is a lack of leadership.  It is because of this that “direction” is critical to find that place where we can align on decisions but equally have tension to ensure we are doing the right things, but don’t have to go back to basics at every meeting.  

This is a super article on “How the Afterglow of Victory Boosts Future Performance” by Miguel Sousa Lobo, INSEAD Associate Professor of Decision Sciences. August 2, 2021. The takeaway is that when we perform well together, we feel less tense with each other, and when we feel less tense with each other, we perform better.  Alignment works but we have to also look for the paradox or we will find ourselves in the comfort blanket of self-confirmation.  Keeping creative tension alive is hard as we are trying to find the balance between conflict and denial.   

Where does Peak Paradox fit into the strategic planning cycle?

Probably when there is either too much conflict or when there is not enough!




Monday, 09. August 2021

Phil Windley's Technometria

Ephemeral Relationships

Summary: Many of the relationships we have online don’t have to be long-lived. Ephemeral relationships, offering virtual anonymity, are technically possible without a loss of functionality or convenience. Why don’t they exist? Surveillance is profitable. In real life, we often interact with others—both people and institutions—with relative anonymity. For example, if I go the store a

Summary: Many of the relationships we have online don’t have to be long-lived. Ephemeral relationships, offering virtual anonymity, are technically possible without a loss of functionality or convenience. Why don’t they exist? Surveillance is profitable.

In real life, we often interact with others—both people and institutions—with relative anonymity. For example, if I go to the store and use cash to buy a coke, there is no exchange of identity information. Even if I use a credit card, it's rarely the case that the entire transaction happens under the administrative authority of the identity system inherent in the credit card. Only the financial part of the transaction takes place in that identity system. This is true of most interactions in real life.

I don't have an account at the local grocery store where I store my address, credit card, and other information so that each transaction is linked to a record about me. True, many businesses have loyalty programs and use those to collect information about customers, but those are optional. And going without one doesn't significantly inconvenience me. In fact, the point of the credit card system is that it avoids long-lived relationships between any of the parties except the customer (or merchant) and their bank.

In real life, we do without identity systems for most things. You don't have to identify yourself to the movie theater to watch a movie or log into some administrative system to sit in a restaurant and have a private conversation with friends. In real life, we act as embodied, independent agents. Our physical presence and the laws of physics have a lot to do with our ability to function with workable anonymity across many domains.

One of the surprising things about identity in the physical world is that so many of the relationships are ephemeral rather than long-lived. While the ticket taker at the movies and the server at the restaurant certainly "identify" patrons, they forget them as soon as the transaction is complete. And the identification is likely pseudonymous (e.g. "the couple at table four" rather than "Phillip and Lynne Windley"). These interactions are effectively anonymous.

Of course, in the digital world, very few meaningful transactions are done outside of some administrative identity system. There are several reasons why identity is so important in the digital world. But we've accepted long-lived relationships with full legibility of patrons as the default on the web.

Some of that is driven by convenience. I like storing my credit cards and shipping info at Amazon because it's convenient. I like that they know what books I've bought so I don't buy the same book more than once (yeah, I'm that guy). But what if I could get that convenience without any kind of account at Amazon at all? That's the promise of verifiable credentials and self-sovereign identity.

You can imagine an ecommerce company that keeps no payment or address information on customers, but is still able to process their orders and send the merchandise. If my shipping information and credit card information are stored as verifiable credentials in a digital wallet I control, I can easily provide these to whatever web site I need to as needed. No need to have them stored. And we demonstrated way back in 2009 a way to augment results from a web site with a self-sovereign data store. That could tell me what I already own as I navigate a site.

There's no technical reason we need long-lived relationships for most of our web interactions. That doesn't mean we won't want some for convenience, but they ought to be optional, like the loyalty program at the supermarket, rather than required for service. Our digital lives can be as private as our physical lives if we choose for them to be. We don't have to allow companies to surveil us. And the excuse that they surveil us to provide better service is just that—an excuse. The real reason they surveil us is because it's profitable.

Photo Credit: Ghost Trees from David Lienhard (CC BY-SA 3.0)

Tags: identity ssi privacy relationships


Damien Bod

Send Emails using Microsoft Graph API and a desktop client

This article shows how to use Microsoft Graph API to send emails for a .NET Core Desktop WPF application. Microsoft.Identity.Client is used to authenticate using an Azure App registration with the required delegated scopes for the Graph API. The emails can be sent with text or html bodies and also with any file attachments uploaded […]

This article shows how to use Microsoft Graph API to send emails for a .NET Core Desktop WPF application. Microsoft.Identity.Client is used to authenticate using an Azure App registration with the required delegated scopes for the Graph API. The emails can be sent with text or HTML bodies and also with any file attachments uploaded in the WPF application.

Code: https://github.com/damienbod/EmailCalandarsClient

To send emails using Microsoft Graph API, you need an Office license for the Azure Active Directory user that sends the email.

You can sign-in here to check this:

https://www.office.com

Setup the Azure App Registration

Before we can send emails using Microsoft Graph API, we need to create an Azure App registration with the correct delegated scopes. In our example, the URI http://localhost:65419 is used for the AAD redirect to the browser opened by the WPF application, and this is added to the authentication configuration. Once created, the client ID of the Azure App registration is used in the application's app settings, along with the tenant ID and the scopes.

You need to add the required scopes for the Graph API to send emails. These are delegated permissions, which can be accessed using the Add a permission menu.

The Mail.Send and the Mail.ReadWrite delegated scopes from the Microsoft Graph API are added to the Azure App registration.

To add these, scroll down through the items in the Add a permission, Microsoft Graph API delegated scopes menu and check the checkboxes for Mail.Send and Mail.ReadWrite.

Desktop Application

The Microsoft.Identity.Client and the Microsoft.Identity.Web.MicrosoftGraphBeta NuGet packages are used to authenticate and use the Graph API. You could probably use the Graph API NuGet packages directly instead of Microsoft.Identity.Web.MicrosoftGraphBeta; I used this package since I normally do web development and it has everything required.

<ItemGroup>
  <PackageReference Include="Microsoft.Identity.Client" Version="4.35.1" />
  <PackageReference Include="Microsoft.Identity.Web.MicrosoftGraphBeta" Version="1.15.2" />
  <PackageReference Include="Newtonsoft.Json" Version="13.0.1" />
</ItemGroup>

The PublicClientApplicationBuilder class is used to define the redirect URL which matches the URL from the Azure App registration. The TokenCacheHelper class is the same as from the Microsoft examples.

public void InitClient()
{
    _app = PublicClientApplicationBuilder.Create(ClientId)
        .WithAuthority(Authority)
        .WithRedirectUri("http://localhost:65419")
        .Build();

    TokenCacheHelper.EnableSerialization(_app.UserTokenCache);
}

The identity can authenticate using the SignIn method. If a session already exists, a token is acquired silently; otherwise an interactive flow is used.

public async Task<IAccount> SignIn()
{
    try
    {
        var result = await AcquireTokenSilent();
        return result.Account;
    }
    catch (MsalUiRequiredException)
    {
        return await AcquireTokenInteractive().ConfigureAwait(false);
    }
}

private async Task<IAccount> AcquireTokenInteractive()
{
    var accounts = (await _app.GetAccountsAsync()).ToList();

    var builder = _app.AcquireTokenInteractive(Scopes)
        .WithAccount(accounts.FirstOrDefault())
        .WithUseEmbeddedWebView(false)
        .WithPrompt(Microsoft.Identity.Client.Prompt.SelectAccount);

    var result = await builder.ExecuteAsync().ConfigureAwait(false);
    return result.Account;
}

public async Task<AuthenticationResult> AcquireTokenSilent()
{
    var accounts = await GetAccountsAsync();
    var result = await _app.AcquireTokenSilent(Scopes, accounts.FirstOrDefault())
        .ExecuteAsync()
        .ConfigureAwait(false);

    return result;
}

The SendEmailAsync method uses a message object and the Graph API to send the emails. If the identity is authenticated and has the required permissions and license, an email will be sent using the definitions from the Message class.

public async Task SendEmailAsync(Message message)
{
    var result = await AcquireTokenSilent();

    _httpClient.DefaultRequestHeaders.Authorization =
        new AuthenticationHeaderValue("Bearer", result.AccessToken);
    _httpClient.DefaultRequestHeaders.Accept.Add(
        new MediaTypeWithQualityHeaderValue("application/json"));

    GraphServiceClient graphClient = new GraphServiceClient(_httpClient)
    {
        AuthenticationProvider = new DelegateAuthenticationProvider(async (requestMessage) =>
        {
            requestMessage.Headers.Authorization =
                new AuthenticationHeaderValue("Bearer", result.AccessToken);
        })
    };

    var saveToSentItems = true;

    await graphClient.Me
        .SendMail(message, saveToSentItems)
        .Request()
        .PostAsync();
}

The EmailService class is used to add the recipient, header (subject) and body to the message which represents the email. The attachments are added separately using the MessageAttachmentsCollectionPage class. The AddAttachment method adds as many attachments to the email as required; these are uploaded as a base64 byte array. The service can send HTML bodies or text bodies.

public class EmailService
{
    MessageAttachmentsCollectionPage MessageAttachmentsCollectionPage = new MessageAttachmentsCollectionPage();

    public Message CreateStandardEmail(string recipient, string header, string body)
    {
        var message = new Message
        {
            Subject = header,
            Body = new ItemBody
            {
                ContentType = BodyType.Text,
                Content = body
            },
            ToRecipients = new List<Recipient>()
            {
                new Recipient
                {
                    EmailAddress = new EmailAddress
                    {
                        Address = recipient
                    }
                }
            },
            Attachments = MessageAttachmentsCollectionPage
        };

        return message;
    }

    public Message CreateHtmlEmail(string recipient, string header, string body)
    {
        var message = new Message
        {
            Subject = header,
            Body = new ItemBody
            {
                ContentType = BodyType.Html,
                Content = body
            },
            ToRecipients = new List<Recipient>()
            {
                new Recipient
                {
                    EmailAddress = new EmailAddress
                    {
                        Address = recipient
                    }
                }
            },
            Attachments = MessageAttachmentsCollectionPage
        };

        return message;
    }

    public void AddAttachment(byte[] rawData, string filePath)
    {
        MessageAttachmentsCollectionPage.Add(new FileAttachment
        {
            Name = Path.GetFileName(filePath),
            ContentBytes = EncodeTobase64Bytes(rawData)
        });
    }

    public void ClearAttachments()
    {
        MessageAttachmentsCollectionPage.Clear();
    }

    static public byte[] EncodeTobase64Bytes(byte[] rawData)
    {
        string base64String = System.Convert.ToBase64String(rawData);
        var returnValue = Convert.FromBase64String(base64String);
        return returnValue;
    }
}

Azure App Registration settings

The values specific to your Azure Active Directory tenant and the Azure App registration need to be added to the app settings of the .NET Core application. The Scope configuration is set to the scopes required to send emails.

<appSettings>
  <add key="AADInstance" value="https://login.microsoftonline.com/{0}/v2.0"/>
  <add key="Tenant" value="5698af84-5720-4ff0-bdc3-9d9195314244"/>
  <add key="ClientId" value="ae1fd165-d152-492d-b4f5-74209f8f724a"/>
  <add key="Scope" value="User.read Mail.Send Mail.ReadWrite"/>
</appSettings>

WPF UI

The WPF application provides an Azure AD login for the identity. The user of the WPF application can sign in using a browser which redirects to the AAD authentication page. Once authenticated, the user can send an HTML email or a text email. The AddAttachment method uses the OpenFileDialog to upload a file in the WPF application, gets the raw bytes and adds these to the attachments, which are sent with the next email message. Once the email is sent, the attachments are removed.

public partial class MainWindow : Window
{
    AadGraphApiDelegatedClient _aadGraphApiDelegatedClient = new AadGraphApiDelegatedClient();
    EmailService _emailService = new EmailService();

    const string SignInString = "Sign In";
    const string ClearCacheString = "Clear Cache";

    public MainWindow()
    {
        InitializeComponent();
        _aadGraphApiDelegatedClient.InitClient();
    }

    private async void SignIn(object sender = null, RoutedEventArgs args = null)
    {
        var accounts = await _aadGraphApiDelegatedClient.GetAccountsAsync();

        if (SignInButton.Content.ToString() == ClearCacheString)
        {
            await _aadGraphApiDelegatedClient.RemoveAccountsAsync();

            SignInButton.Content = SignInString;
            UserName.Content = "Not signed in";
            return;
        }

        try
        {
            var account = await _aadGraphApiDelegatedClient.SignIn();

            Dispatcher.Invoke(() =>
            {
                SignInButton.Content = ClearCacheString;
                SetUserName(account);
            });
        }
        catch (MsalException ex)
        {
            if (ex.ErrorCode == "access_denied")
            {
                // The user canceled sign in, take no action.
            }
            else
            {
                // An unexpected error occurred.
                string message = ex.Message;
                if (ex.InnerException != null)
                {
                    message += "Error Code: " + ex.ErrorCode + "Inner Exception : " + ex.InnerException.Message;
                }

                MessageBox.Show(message);
            }

            Dispatcher.Invoke(() =>
            {
                UserName.Content = "Not signed in";
            });
        }
    }

    private async void SendEmail(object sender, RoutedEventArgs e)
    {
        var message = _emailService.CreateStandardEmail(EmailRecipientText.Text,
            EmailHeader.Text, EmailBody.Text);

        await _aadGraphApiDelegatedClient.SendEmailAsync(message);
        _emailService.ClearAttachments();
    }

    private async void SendHtmlEmail(object sender, RoutedEventArgs e)
    {
        var messageHtml = _emailService.CreateHtmlEmail(EmailRecipientText.Text,
            EmailHeader.Text, EmailBody.Text);

        await _aadGraphApiDelegatedClient.SendEmailAsync(messageHtml);
        _emailService.ClearAttachments();
    }

    private void AddAttachment(object sender, RoutedEventArgs e)
    {
        var dlg = new OpenFileDialog();
        if (dlg.ShowDialog() == true)
        {
            byte[] data = File.ReadAllBytes(dlg.FileName);
            _emailService.AddAttachment(data, dlg.FileName);
        }
    }

    private void SetUserName(IAccount userInfo)
    {
        string userName = null;

        if (userInfo != null)
        {
            userName = userInfo.Username;
        }

        if (userName == null)
        {
            userName = "Not identified";
        }

        UserName.Content = userName;
    }
}

Running the application

When the application is started, the user can sign in using the Sign in button.

The standard Azure AD login is used in a popup browser. Once the authentication is completed, the browser redirect sends the tokens back to the application.

If a file attachment needs to be sent, the Add Attachment button can be used. This opens up a dialog and any single file can be selected.

When the email is sent successfully, the email and the file can be viewed in the recipient's inbox. The emails are also saved to the sender's sent items. This can be disabled if required.

Links

https://docs.microsoft.com/en-us/graph/outlook-send-mail-from-other-user

https://stackoverflow.com/questions/43795846/graph-api-daemon-app-with-user-consent

https://winsmarts.com/managed-identity-as-a-daemon-accessing-microsoft-graph-8d1bf87582b1

https://cmatskas.com/create-a-net-core-deamon-app-that-calls-msgraph-with-a-certificate/

https://docs.microsoft.com/en-us/graph/api/user-sendmail?view=graph-rest-1.0&tabs=http

https://docs.microsoft.com/en-us/answers/questions/43724/sending-emails-from-daemon-app-using-graph-api-on.html

https://stackoverflow.com/questions/56110910/sending-email-with-microsoft-graph-api-work-account

https://docs.microsoft.com/en-us/graph/sdks/choose-authentication-providers?tabs=CS#InteractiveProvider

https://converter.telerik.com/

Sunday, 08. August 2021

Moxy Tongue

Data Mob Rule

Individual Rights are hard to come by historically. Strong people make them possible. First requirement of their existence is thus, strong people. Weak and fearful people tend to group together for support initially, and then like crabs in a bucket, to prevent others from escaping the huddle of their security perception. For the purposes of "Individual Rights", this tribal, fearful, mob enforced c

Individual Rights are hard to come by historically. Strong people make them possible. First requirement of their existence is thus, strong people. Weak and fearful people tend to group together for support initially, and then like crabs in a bucket, to prevent others from escaping the huddle of their security perception. For the purposes of "Individual Rights", this tribal, fearful, mob enforced compliance and perceptual safety is an enemy of fulfilling the experience of those "Individual Rights". 

Individual liberty is the expression of "Individual Rights". A Sovereign Nation derived "of, by, for" people with "Individual Rights" defined broadly as "life, liberty and the pursuit of happiness", and given contextual authority as "unalienable" and "self-evident" in nature, must inherently emit from the Individual people that such a Sovereign Nation is composed of. This is the foundational reality of America. 

Conversely, mob rule is an expression of mob authority. When a "Bill Of Rights" sets out the 1st Right as the "freedom of speech", it is further evidence of the foundational intent of its authors, all Individual people expressing "Individual Rights" in defining their own Sovereign authority. Therefore, when the Government derived "of, by, for" such people yields corporate charters, and under incorporated Sovereign State Law produces administrative methods of corporate employment, such that "freedom of speech" no longer belongs directly to people, Individuals all, but instead, administratively recasts a person as an employee with provisioned Rights of the State, something is askew. Many such "provisioned Rights" exist, but of greater consequence to the civil society resulting from the original founding intent and structural declarations with Constitutional authority, this administrative process enforces a new type of "mob rule" onto people, and their infrastructure supporting their "Rights" under Sovereign Law.

Mob rule can thus operate with 50.1% efficiency. A bicameral legislative body can be socially engineered into an expression of mob rule via a two party political system, whereby the people, Individuals all, no longer are the authorities in expressing their Sovereign Rights, except by replacement of new members who inherently are leveraged into joining existing mob rule parties. Independent people, Individual people with "Individual Rights" who serve their peers and civil Society are reduced to functionaries of the mob ruling political party system. All consensus is among the two parties, and the Institution of Governance is no longer directly representative of the best interests of the people, Individuals All. Instead, people reduce their participation to fit into the priorities of the two parties with mob rule authority in the bicameral legislative body.

Lookup "Republican" and "Democrat" in the text of the Constitution of the UNITED STATES OF AMERICA and Declaration of Independence for yourself. What "Rights" are conveyed "of, by, for" these political parties as representations of the people?


Mob Rule is the "Way We Go Down". Individual Rights is our victory, and salvation.

All of human history is reducible to the fight between mob rule and the Individual expression of human Rights. Mobs hate people. Let it sink in.. mobs are afraid of people, Individuals all. Movies as esoteric as Footloose repeatedly document the fear the mob holds for Individuals expressing their freedoms. Mobs are the reduction of human intelligence, to the common baseline of fear, control, subservience, and domination of the human spirit within every Individual.

The fight for Individual Rights is one of control; how do Individuals control their own authority? The struggle against mob rule, against communist or socialist practices that squander human freedom in search for mob security, is one of eternal merit protecting that which makes humanity redeemable. Enslavement of humans by humans, written into the DNA of progress itself, manifesting destinies for all people, is only redeemable, forgivable, and teachable if the value it produces extends the power of Individual Rights to all people.

[Revised Appropriately]:

"We hold these truths to be self-evident, that all mankind is created equal and endowed by their Creator with certain unalienable Rights, that among these are life, liberty and the personal pursuit of happiness. Governments exist to guarantee the Rights of, by, and for such people, Individuals all, and no abstraction of literary, or mathematical form, shall in any way distort the meaning of such human authority by people, Individuals all."

America has a literary problem; "We the People" is a term of abstraction. "We the People" has been abstracted by legalese into a meaning that is out-of-step with observable reality. "We the People" are not a data aggregation algorithmic result, "We the People" are not a legal entity/State or one-off representation of authority. "We the people" are Individuals all, and together as people, Individuals all living among one another, our personal authority has constructed a Government that must work by the efforts of the people it is for as was declared, constituted and enforced by great personal action on the part of many people, Individuals all.

Degradation of people, Individuals all, to instruments of mob rule is out-of-step with the intent for the existence of the UNITED STATES OF AMERICA, and the use of political parties to induce cult-like mob authority in lieu of accountability to the actual structural integrity of people must be amended. The people must be defined accurately. As it was unacceptable to declare "All men equal.." and not include women or non-white people in that definition, so too is it inappropriate to cause "We the People" to be translated in any way other than may be observed by a child in nature. People exist as Individuals only, and any Union made more perfect must reflect this fact in its operational constraints, dependencies, protocols, policies, Laws and methods of enforcing human Rights.

America's greatest power is its conceptualization of people with Individual Rights. Our "more perfect Union" is materialized only through the expression of these Individual Rights for all of its people, as well as the opportunity for its expression for all the people of planet Earth. Like a siphon pulling the greatest minds towards its gravitational center, to compete on a planet where American Rights for all exist is the greatest human opportunity to build upon human empowerment the world has ever known.

The American people, Individuals all, have enemies lined up against them. This is unlikely to change so long as fearful, greedy, and life-reducing bureaucrats are in a position to exercise their contrived authority resulting from practices of mob rule. All that any person need do to observe the reality of any "leader" extending any "policy" or practice affecting the lives of people is to interrogate the operation to find out how human power is distributed. If power is taken from people and given to a central power-processing entity to re-distribute to people defined differently than all others are by default, you are looking at a function of mob rule. If a person suggests that people, Individuals all, are not accountable for their own freedom, and can shirk their personal responsibilities to care for the babies, toddlers, young people, or elderly in need of human guardianship evoked from the sustaining love that gave them life, you are listening to an agent of mob rule.

Any act that induces mob rule is in support of mob rule.

Any act that sustains Individual Liberty is in support of Individual Rights.

Can you interpret reality?

Data is the dirt of digital existence; everything is made of data. Data is the carbon of digital existence; everything that matters to Humanity is made of data. Individual Rights are made of data, and when it comes to Individual Rights, structure yields results. Human authority is sourced directly from people, Individuals all. People are the source of Sovereign authority, or mob rule is being leveraged. It is that simple.

When designing a data structure to operate the Rights mechanisms of a Governing entity within a civil society enabled "of, by, for" people, Individuals all, both in bureaucratic Terms as well as in the process of managing a Governance body composed of an Executive, Legislative and Judicial branch, the key constraint is the source of "useful identity". The useful identity is the primary key of both freedom and security for people within their own Governance bureaucracy where data is leveraged contextually of, by, for all making such a more perfect Union possible.

America's "Founding Fathers" gave the origin of this procession physical form, both for a King, as well as for every King's Administration System, or Government, that would ever exist henceforth. John Hancock made sure to write it large so no one would miss it. These people got it right the first time around, but an error of omission has caused the same procession to fail to keep repeating. People, Individuals all, have been denied the same opportunity afforded in the original declaration, and in the absence of a "recursive signatory event", one that repeats generation after generation, second class participants have been made of "We the People" as abstractions of legal literature.

You must originate your own Sovereign source authority via a recursive signatory made "of, by, for" your own Governance, as a person, an Individual. All of us, living among one another must be equal in this origin of personal authority, as our participatory rights are founded on the accurate definition of "We the People", Individuals all.

Women and men, equal people in the eyes of the law, Individuals all; No exceptions under American Law.

There is no alternative that yields an American citizen. If the next century is indeed a great global competition between America and China, let it be fought on the grounds of its inherent construct: 

People, Individuals All with Individual Rights

Vs

Authoritarian Enforced Mob Rule managed by fear and control of Individual Rights.

Internally, within any civil society claiming to represent people as freedom loving beings, pay attention to the practices used to enforce/empower the authority making Individual Rights possible. The past is made of many painful mistakes, mis-definitions, and missed opportunities, but today is for people, Individuals all, to get the story Right. 

Individual Rights come from people, Individuals All. There is no other possibility. 

ID:DATA is the primary key of both freedom and security in a civil society. People must own root authority over their ID:DATA. The structure of your data existence must match the inherent structure of your Individual Rights. Individual Rights come from people. Individual ID:DATA Rights come from people, Individuals all.

America comes from self-Sovereign human authority, defined generation after generation of, by, for those people. Expressing your Rights is a recursive act with an origin story. Make sure you own root. Mob rule is hunting your life, protect yourself. Protect the possibility of your more perfect Union.

Be Vigilant..


Friday, 06. August 2021

Moxy Tongue

Is A Computationally Accurate Constitution Possible?

Can people, as Individuals, serve as the source of their own Governance within a digitally designed infrastructure supporting the Constitutional Rights of people, such as American citizens represent? Is the data structure of default participation capable of supporting Universal Human Rights applied within a "Civil Society" defined by Laws and International Statements of Support instantiating Globa

Can people, as Individuals, serve as the source of their own Governance within a digitally designed infrastructure supporting the Constitutional Rights of people, such as American citizens represent? Is the data structure of default participation capable of supporting Universal Human Rights applied within a "Civil Society" defined by Laws and International Statements of Support instantiating Global Governing bodies, such as the United Nations represents?


Yes?

(Click Here) Vote with action + comment (direct: https://twitter.com/NZN/status/1423704494475563009)


No? 

(Click Here) Vote with action + comment (direct: https://twitter.com/NZN/status/1423704515090522123)

Thursday, 05. August 2021

Habitat Chronicles

Living Worlds Considered Harmful

A Response to the Documentation of the Living Worlds Working Group1997-02-27 [A post by Douglas Crockford, recovered from the internet archive.] Introduction The Livings Worlds Initiative is the work »»

A Response to the Documentation of the Living Worlds Working Group
1997-02-27

[A post by Douglas Crockford, recovered from the internet archive.]

Introduction

The Living Worlds Initiative is the work of a small but dedicated group of VRML developers who have a deep interest in extending VRML into the basis for interconnected virtual worlds. This project has been inextricably bound to a very effective public relations campaign and standards setting effort. The project is still in development, but is already being promoted as an industry standard for virtual worlds.

The Living Worlds Working Group has been signing up a large number of companies as supporters of the effort, including IBM and Apple. What is not clear to most observers is that support means nothing more than agreeing that standards are desirable.

Within the industry, there is common misunderstanding of what support for Living Worlds means, even among the supporters. The result is that support for a Living Worlds Standard is starting to snowball. It is being regarded as a proposed standard, but it has not had to face any sort of rigorous review. The purpose of this response is to begin the belated review of the Living Worlds Documentation as a proposed standard.

Premature Standardization

There is a growing list of companies that are developing VRML-based virtual worlds. The sector is attracting a lot of attention. Even so, most of the social activity on the Internet today is in IRC and the chat rooms of AOL. The most successfully socialized avatar worlds are WorldsAway and The Palace, neither of which are VRML-based. The VRML worlds have seen a lot of churn, but are not creating significant sustaining communities.

The weakness of community formation in many of the VRML worlds may be the result of the newness of the worlds and the inexperience of the world designers, who have hampered themselves by putting 3D graphics ahead of socialization.

It is too early to be standardizing 3D social architecture. More experimentation is required. If the Living Worlds Initiative is an experiment conducted by a group of cooperating companies, then it is a good thing. If it is a standard-setting activity, then it is premature and harmful.

The operation of 3D worlds has not been shown to be a profitable activity. The business model is driven by affection for Neal Stephenson’s satiric cyberpunk novel, Snow Crash. Snow Crash describes a virtual world called the Metaverse. Some day, we may have a real Metaverse, and it might even be as important in our world as it was in the novel.

Living Worlds does not implement the Metaverse. It only makes something that looks like it, a meta-virtual world.

VRML itself is still new. VRML 2.0 was announced at Siggraph 96, and complete implementations are only now coming on line. The VRML 2.0 initiative was as frenzied as the Living Worlds Initiative, and because of the haste, the result was suboptimal. A consequence is that part of the Living Worlds Initiative contains some workarounds for VRML 2.0 limitations.

Security

The word “security” does not occur in the Living Worlds Documentation except to point out a security hole in VRML 2.0. The lack of attention to security by the Living Worlds Working Group is not a problem if the Initiative is viewed as an experiment. One of the benefits of the experiment will be to demonstrate why security is critical in the design of distributed systems. If the Living Worlds Initiative is setting a standard, then it is harmful.

Security is a very complicated and subtle subject. Absolute security is never achievable, but diligent design and application can produce systems which are worthy of trust.

The Living Worlds Documentation identifies three issues in which distributed security is critical.

handle everything via dynamically downloaded Java applets
protect the local scene from damage by imported objects
support authentication certificates (dice, business cards)

The Documentation does not adequately address any of those issues.

Lacking security at the lowest levels, Living Worlds is not strong enough to offer a credible trust model at the human-interaction level. In systems which can be hacked, concepts like identity, credentials, and accountability are meaningless.

This severely limits the application scope of Living Worlds. Environments which permit interpersonal commerce or confidential collaboration should not be implemented in Living Worlds.

Secure software distribution

Software is the lifeblood of virtual communities. The value and diversity of these systems depend on the ability to dynamically load and execute software from the network. Unfortunately, this raises a huge security concern. Software from the network could contain viruses, trojan horses, and other security threats. Because of the dynamic and interconnected nature of virtual communities, the protection mechanisms provided by Java are not adequate.

The Living Worlds Documentation notes that

…at present, most systems prohibit Java from accessing local files, which makes it impossible, for example, to connect to locally installed third party software features. Until this problem is generically solved by the Java community, the management of downloads and local access are left to proprietary MUtech solutions.

The proprietary MUtech solutions will create a security hole, and possibly compromise the goal of interoperability at the same time. In order for the dynamic, distributed virtual community to be viable, the issue of secure software distribution must be solved from the outset. Class signing is not a solution. A secure, distributed architecture is required. It is doubtful that credible security mechanisms can be retrofitted later.

Protect the local scene

Related to the problem of software distribution is the question of rights granted to objects. Objects that are valued in some places might be obnoxious or dangerous in others. The Living Worlds Documentation describes an incontinent puppy as an example of such an object. A secure architecture needs to deal with these sorts of issues from the outset. The Living Worlds Documentation identifies the problem, but does not solve it.

Authentication

The Living Worlds Documentation calls for the use of authentication certificates as a mechanism for assuring confidence in certain objects. Unfortunately, if the architecture is not secure, there is not a reliable context in which certificates can be trusted. Because Living Worlds is hackable, certified objects can be compromised.

Community

Communities need tools with which they can create policies to regulate the social order. Different communities will have different needs, conventions, standards. The Living Worlds Documentation says this about the task of designing those tools:

Two things seem clear. First, that designing a persuasively complete infrastructure for managing user rights, roles and rules is an essentially open-ended task. Second that building a simple, open-ended framework for the same domain can probably be completed with very little effort.

Unfortunately, the Working Group does not adequately understand the issues involved. They will create a tool based on a limited understanding of the problem, attempt to drive it into a standard, and leave to others the social and administrative headaches it will cause.

This general strategy applies to the rest of the Living Worlds effort as well.

Our goal is to reach quick consensus on a minimal subset, and then to encourage the rapid creation of several reference implementations of that proposed feature set. Refinement of the concepts can then proceed in an atmosphere usefully disciplined by implementation experience.

Problems of this kind cannot be solved by refinement.

Incomplete

If the Living Worlds Documentation were just the work in progress of a working group, then it is appropriate that they publish their materials on the net for comment by interested parties, and it would be absurd to point out that the work is unfinished. But because it is also being presented publicly as a networking standard, and because the Living Worlds Working Group has already begun the work of standard setting, the Documentation needs to be tested for its fitness as a standard.

If the Living Worlds Documentation is read as a proposed standard, then it should be rejected out-of-hand, simply because it is incomplete. In its present form, the Living Worlds Documentation is not even complete enough to criticize.

Principles

The Living Worlds Working Group selected a set of principles to guide the development process. Membership in the working group is open to anyone who can accept the principles. This is a reasonable way for a working group to define itself. Unfortunately, the principles of the Working Group are problematic for a standards body. While the Living Worlds Documentation is not complete enough to criticize, the principles and basic architecture can be criticized.

Build on VRML 2.0.
"Use VRML 2.0" would have been a better first principle. By building on VRML 2.0, the Working Group is hoping to work some or all of the Living Worlds work into the VRML 3.0 standard, thereby increasing the importance of the Living Worlds Standard. This component-oriented principle led the Working Group to put the display processor in the center of a distributed architecture, ignoring decades of experience in the separation of I/O from other computational goals. Fortunately, the recent moderating influence of the Mitsubishi Electric Research Laboratory (MERL) has opened the Living Worlds Working Group to the possibility of other presentation models. Unfortunately, the Living Worlds Architecture is already fixed on a set of unwieldy interfaces which were motivated by a VRML-centric design space.

Standards, not designs.
The second principle is intended to give implementers a large amount of leeway in realizing the standard. The amount of leeway is so great that it might not be possible for independent implementations to interoperate with implementations developed by the Working Group. Since that is specifically what a standard is supposed to accomplish, the second principle is self-defeating. The other benefit of the second principle to the Working Group is to provide an expedient method of dealing with disputes. When the members of the Working Group do not agree on an architectural point, they agree to disagree and leave the choice to the implementer. Sometimes the reason they do not agree is because they were confronting an essential, hard problem.

Architectural Agnosticism.
The third principle concerns the question of centralized (server-based) or decentralized (distributed) architecture. Centralized social networking systems often suffer from performance problems because the server can become a bottleneck. The Working Group therefore wants to keep the option of decentralization open: A centralized architecture cannot be transformed into a decentralized architecture simply by being vague about the role of the server. Decentralized design requires the solution of a large number of hard problems, particularly problems of security. An insecure architecture can facilitate the victimization of avatar owners by miscreants. Insecurity will also limit the architecture to supporting only limited interactions, such as chat. Higher value services like cooperative work and interpersonal commerce require a secure platform. Such a platform is not a goal or result of the Living Worlds Initiative. Because the third principle does not explicitly call for the solution to the problems of secure decentralization, it is self-defeating, resulting in implementations which are either insecure or devoutly centralist or both.

Respect the role of the market.
In the fourth principle, the Working Group chooses this unfortunate non-goal: The process does not pay adequate attention to the consequences of the design. The goal of the Working Group is to establish a standard early, relying on iteration in the maintenance of the standard to make it, if not the best imaginable, then good enough for commercial use. The process is not forward-looking enough to provide confidence that the standard can be corrected later. Significant architectural features, such as security, are extremely difficult to insert compatibly into existing systems.

Require running code.
The fifth principle appears to be the most respectable, but when coupled with the urgency and recklessness of the fourth principle, it becomes the most dangerous. A standards development process that requires demonstration of new techniques before incorporating them into the standard can be a very good thing because it provides assurance that the standard is implementable. It can also provide early evidence of the utility of the new techniques. But if such a process is driven by extreme time pressure, as the Living Worlds Working Group is, then the fifth principle has a terrible result: only ideas with quick and dirty implementations can be considered. The Working Group will finish its work before hard problems can be understood and real solutions can be produced. So, by principle, the Working Group is open, but not to good ideas that will require time and effort to realize.

Conclusion

The software industry sometimes observes that its problems are due to not having standards, or to having too many standards. Often, its problems are due to having standards that are not good enough.

Premature standardization in the area of virtual worlds will not assure success.

The Living Worlds Initiative is a model for cooperative research, and as such it should be encouraged. The Working Group is using the net to create a virtual community of software developers working together on a common project. This is very good.

Unfortunately, the Living Worlds Initiative is also a standards-setting initiative, building on the momentum of the recent VRML 2.0 standard. It would be harmful to adopt the Living Worlds Initiative as a standard at this time.


Jon Udell

The Tao of Unicode Sparklines

I’ve long been enamored of the sparkline, a graphical device which its inventor Edward Tufte defines thusly: A sparkline is a small intense, simple, word-sized graphic with typographic resolution. Sparklines mean that graphics are no longer cartoonish special occasions with captions and boxes, but rather sparkline graphics can be everywhere a word or number can … Continue reading The Tao of Unicode

I’ve long been enamored of the sparkline, a graphical device which its inventor Edward Tufte defines thusly:

A sparkline is a small intense, simple, word-sized graphic with typographic resolution. Sparklines mean that graphics are no longer cartoonish special occasions with captions and boxes, but rather sparkline graphics can be everywhere a word or number can be: embedded in a sentence, table, headline, map, spreadsheet, graphic.

Nowadays you can create sparklines in many tools including Excel and Google Sheets, both of which can use the technique to pack a summary of a series of numbers into a single cell. By stacking such cells vertically you can create views that compress vast amounts of information.

In A virtuous cycle for analytics I noted that we often use Metabase to display tables and charts based on extracts from our Postgres warehouse. I really wanted to use sparklines to summarize views of activity over time, but that isn’t yet an option in Metabase.

When Metabase is connected to Postgres, though, you can write Metabase questions that can not only call built-in Postgres functions but can also call user-defined functions. Can such a function accept an array of numbers and return a sparkline for display in the Metabase table viewer? Yes, if you use Unicode characters to represent the variable-height bars of a sparkline.

There’s a page at rosettacode.org devoted to Unicode sparklines based on this sequence of eight characters:

U+2581 ▁ LOWER ONE EIGHTH BLOCK
U+2582 ▂ LOWER ONE QUARTER BLOCK
U+2583 ▃ LOWER THREE EIGHTHS BLOCK
U+2584 ▄ LOWER HALF BLOCK
U+2585 ▅ LOWER FIVE EIGHTHS BLOCK
U+2586 ▆ LOWER THREE QUARTERS BLOCK
U+2587 ▇ LOWER SEVEN EIGHTHS BLOCK
U+2588 █ FULL BLOCK

Notice that 2581, 2582, and 2588 are narrower than the rest. I’ll come back to that at the end.

If you combine them into a string of eight characters you get this result:

▁▂▃▄▅▆▇█

Notice that the fourth and eighth characters in the sequence drop below the baseline. I’ll come back to that at the end too.

These characters can be used to define eight buckets into which numbers in a series can be quantized. Here are some examples from the rosettacode.org page:

“1 2 3 4 5 6 7 8 7 6 5 4 3 2 1” -> ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁
“1.5, 0.5 3.5, 2.5 5.5, 4.5 7.5, 6.5” -> ▂▁▄▃▆▅█▇
“0, 1, 19, 20” -> ▁▁██
“0, 999, 4000, 4999, 7000, 7999” -> ▁▁▅▅██

To write a Postgres function that would do this, I started with the Python example from rosettacode.org:

bar = '▁▂▃▄▅▆▇█'
barcount = len(bar)

def sparkline(numbers):
    mn, mx = min(numbers), max(numbers)
    extent = mx - mn
    sparkline = ''.join(bar[min([barcount - 1, int((n - mn) / extent * barcount)])] for n in numbers)
    return mn, mx, sparkline

While testing it I happened to try an unchanging sequence, [3, 3, 3, 3], which fails with a divide-by-zero error. In order to address that, and to unpack the algorithm a bit for readability, I arrived at this Postgres function:

create function sparkline(numbers bigint[]) returns text as $$

def bar_index(num, _min, barcount, extent):
    index = min([barcount - 1, int( (num - _min) / extent * bar_count)])
    return index

bars = '\u2581\u2582\u2583\u2584\u2585\u2586\u2587\u2588'
_min, _max = min(numbers), max(numbers)
extent = _max - _min
if extent == 0:  # avoid divide by zero if all numbers are equal
    extent = 1
bar_count = len(bars)
sparkline = ''
for num in numbers:
    index = bar_index(num, _min, bar_count, extent)
    sparkline = sparkline + bars[index]
return sparkline

$$ language plpython3u;

Here’s a psql invocation of the function:

analytics=# select sparkline(array[1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1]);
    sparkline
-----------------
 ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁
(1 row)

And here’s an example based on actual data:

Each row represents a university course in which students and teachers are annotating the course readings. Each bar represents a week’s worth of activity. Their heights are not comparable from row to row; some courses do a lot of annotating and some not so much; each sparkline reports relative variation from week to week; the sum and weekly max columns report absolute numbers.

This visualization makes it easy to see that annotation was occasional in some courses and continuous in others. And when you scroll, the temporal axis comes alive; try scrolling this view to see what I mean.

We use the same mechanism at three different scales. One set of sparklines reports daily activity for students in courses; another rolls those up to weekly activity for courses at a school; still another rolls all those up to weekly activity for each school in the system.

At the level of individual courses, the per-student sparkline views can show patterns of interaction. In the left example here, vertical bands of activity reflect students annotating for particular assignments. In the right example there may be a trace of such temporal alignment but activity is less synchronized and more continuous.

When we’re working in Metabase we can use its handy mini bar charts to contextualize the row-wise sums.

The sparkline-like mini bar chart shows a row’s sum relative to the max for the column. Here we can see that a course with 3,758 notes has about 1/4 the number of notes as the most note-heavy course at the school.

Because these Unicode sparklines are just strings of text in columns of SQL or HTML tables, they can participate in sort operations. In our case we can sort on all columns including ones not shown here: instructor name, course start date, number of students. But the default is to sort by the sparkline column which, because it encodes time, orders courses by the start of annotation activity.

The visual effect is admittedly crude, but it’s a good way to show certain kinds of variation. And it’s nicely portable. A Unicode sparkline looks the same in a psql console, an HTML table, or a tweet. The function will work in any database that can run it, using Python or another of the languages demoed at rosettacode.org. For example, I revisited the Workbench workflow described in A beautiful power tool to scrape, clean, and combine data and added a tab for Lake levels.

When I did that, though, the effect was even cruder than what I’ve been seeing in my own work.

In our scenarios, with longer strings of characters, the differences average out and things align pretty well; the below-the-baseline effect has been annoying but not a deal breaker. But the width variation in this example does feel like a deal breaker.

What if we omit the problematic characters U+2581 (too narrow) and U+2584/U+2588 (below baseline and too narrow)?

There are only 5 buckets into which to quantize numbers, and their heights aren’t evenly distributed. But for the intended purpose — to show patterns of variation — I think it’s sufficient in this case. I tried swapping the 5-bucket method into the function that creates sparklines for our dashboards but I don’t think I’ll switch. The loss of vertical resolution makes our longer sparklines less useful, and the width variation is barely noticeable.
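
Here is a minimal sketch of what that 5-bucket variant can look like, adapting the Python example above; it simply keeps the five characters left after the omissions just described, so the exact character set and function name are illustrative rather than what runs in the dashboards.

# Hypothetical 5-bucket variant: drop U+2581, U+2584, and U+2588,
# which are narrower and/or drop below the baseline.
bars5 = '\u2582\u2583\u2585\u2586\u2587'

def sparkline5(numbers):
    _min, _max = min(numbers), max(numbers)
    extent = _max - _min
    if extent == 0:  # all values equal: avoid divide by zero
        extent = 1
    barcount = len(bars5)
    return ''.join(
        bars5[min(barcount - 1, int((n - _min) / extent * barcount))]
        for n in numbers
    )

print(sparkline5([1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1]))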

Unicode evolves, of course, so maybe there will someday be a sequence of characters that’s friendlier to sparklines. Maybe there already is? If so please let me know, I’d love to use it.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/


Werdmüller on Medium

The Hot Van Summer

Taking one last road trip with my mother Continue reading on Medium »

Taking one last road trip with my mother

Continue reading on Medium »

Wednesday, 04. August 2021

MyDigitalFootprint

#Identity. Are we (the industry) the problem?

How many people do you need before identity has value - two! How come, as an industry where 3.2bn (McKinsey) people have a digital identity, are we so fragmented, uncoordinated and disagreeable?  It is evident that our ongoing discussions about identity, ethics, bias, privacy and consent revolve around a lot of noise (opinions) but little signal (alignment), but why?  Recognising that
How many people do you need before identity has value - two! How come, as an industry where 3.2bn (McKinsey) people have a digital identity, are we so fragmented, uncoordinated and disagreeable? 

It is evident that our ongoing discussions about identity, ethics, bias, privacy and consent revolve around a lot of noise (opinions) but little signal (alignment), but why?  The reality is that in 30 years of digital identity we still lack coherent and coordinated action to make it work for everyone. Perhaps it is time to recognise that it is "us", the industry, who are the problem.  We continue in our self-confirming opinions, righteous products and determination to win at all costs.  I am not saying we have not made progress or done amazing things, but we have not done as well as we should have!

As identity now takes several forms, insofar as it emerges from interactions with a system (say payment & reputation), is foundational (given by an authority, a credential), and we can create our own (Self-Sovereign Identity [SSI]), we are more divided than ever on what we should build on. But have we got it totally wrong? This article gives some new thinking to rattle us.

We (anyone in identity, privacy, consent, digital space) are unlikely to like reading this. I am aware the responses are unlikely to be favourable, positive or constructive, because this is hard. The learning and takeaway, in the end, draws from a different market, one that has solved how to act as one even though its members did not align.

-----  

I am first unpacking the current status.

What happens when the user (human) is not the connector?

We live in a digital age that is utterly dependent on the complex connectivity of systems, and in all but one case, the integrated communication path requires the centricity to be on a company, government body, product or service; except for personal identity, privacy and consent that remain focussed on an individual human. 

Identity (all forms) is a function of context based on relationships. Identity exists between the two poles of emergent ID (from systems and interactions) and functional ID (a credential given to us by an authority), with an individual (in the case of personal identity) being the connecting point. The dependencies between these solutions mean that our "identity data" rests in government, organisation, company and individual domains.

Identity in this context

Identity includes everything that you can imagine: who you think you are, passport, ID, login, email, IP address, messages, census data, birth record, photos, friends, education, work, conversations, experience, travel, reputation, scars, memories, health, heart rate, blood pressure, medication, family tree, skin, DNA, parents, heritage, analysis, habits, rules, content, culture, who you identify as - everything yesterday, today and tomorrow that identify you as you and that makes you unique.  

Functional or Foundational Identity

I am keeping it broad & straightforward. Passport, birth certificate, driving licence, DNA, sovereign ID, company ID and anything given to you by someone who has command and control (a credential.)  Foundational Identity has governance and oversight about who issues the foundational identity, and therefore they are classed as an "authoritative source."

Foundational “hard” ID is given, but from it rise many softer identities (identifications) that are all built from having a foundation.  If we had an unbreakable lineage and provenance to back identification from identity, it might help; but there are unintended consequences we have to face.

Pause and Reflection

We know something is broken in the functional ID model as my identity is me, who I am, what drives me, experience; it is not my passport but where the passport took me and the experiences I gained there.  We know some don’t have the foundations of identification, but they have an identity.  Your foundation identity does not define who you are.

Emergence 

Definition: Emergence is where the properties or behaviours only emerge when components or parts interact as part of a whole inclusive system. 

Emergence Identity

An identity that forms because you interact with systems.  Your made-up email address enables someone to know where you live, what you buy, when you shop, who you shop for.  Your preferences and actions identify you.  This identity emerges not because you have a trusted bank account to pay but because you interact, your reputation.  Your passport allows and enables interactions.  The more we interact in a complex digital world, the more emergent identities we have. 

Emergence is usually hidden, insofar as IP address providers or payment transaction analysis systems could be judged as emergent identity creators. We as individuals do not control them or have visibility of the data (dark/hidden). It is very much a debate whether consent is needed and/or whether the creator (the user) should have this data under their control.  Open Banking could be seen as one of the first emergent identity systems that does give the user some control.

If emergent identity had strong governance and oversight, it might also become like a Functional Identity.

We have to accept our identity is more than the sum.

The diagram below is based on a polarity model.  Unlike a linear or value chain model, there is no one universal starting point or final destination.  As we discover the value and limitations of the two principal identity models, we recognise that we continually move around the model, never settling at one place or outcome but learning and improving both.  However, given the boundaries and limitations of each, we need both to be successful.

The purpose of functional and emergence ID is to determine where you start the journey.


There is a deep dependency. 

Without foundational, I cannot access emergent; without emergent, I cannot gain value from foundational. If I cannot gain/ get a foundational identity, emergent identity enables me to start, barter/ trade and protect myself.  Emergence can be a lifeline to getting a foundational identity. 

Companies, organisations and governments believe that control of your identity is critically important. They call it identity to ensure you are not aware if it is emergent or foundational.  They (well, we the industry) will use one to create the other and cause confusion.  

Where does SSI (self-sovereign identity) fit?

Ignoring the name and all those issues, SSI is not emergent ID or functional/ foundational ID, but our agent for identity. SSI is a decentralized model for digital identity based on an entity’s control over forming and using cryptographically secured digital connections (using DIDs) and accepting and presenting digitally signed credentials (VCs). It applies as much to organizations and IoT as individuals. (thanks @DrummondReed)
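
To make the DID piece a little more concrete, here is a minimal sketch of a DID and the kind of DID document it resolves to, roughly following the W3C DID Core data model; the identifiers and key material below are placeholders, not any specific product or network.

# Minimal sketch of a DID and its DID document (placeholders throughout),
# roughly following the W3C DID Core data model.
did = "did:example:123456789abcdefghi"

did_document = {
    "@context": "https://www.w3.org/ns/did/v1",
    "id": did,
    "verificationMethod": [{
        "id": did + "#key-1",
        "type": "Ed25519VerificationKey2018",
        "controller": did,
        "publicKeyBase58": "<public key material>",
    }],
    "authentication": [did + "#key-1"],
}

A verifiable credential issued to that DID can later be presented and checked against the keys the DID controls, which is the cryptographic plumbing behind the "agent for identity" role described above.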

SSI has made me realise that we have created different onboarding positions and propositions, not different approaches to identity.  

SSI could be seen as a new model for providing functional ID that has interesting possibilities for helping with some aspects of emergent ID. But no technology can fully emulate the qualities of emergent ID.

But what does this mean?

There is a deep desire to want the best of all worlds without realising or recognising the separation and interdependence. We debate and argue about transparency, data sharing, ontology requirements, privacy by design, consent driven, control, value, power, and much more in wanting the best.  

Given we cannot have a form of identity without the other, how can we cycle faster to co-create better solutions with the best aspects of both? But recognise that consent works well for foundational identity but is hopeless for emergent. Privacy can protect foundational ID, but not emergent.  

The linkage is us, the human, today, but should we be the link?

Right now, it is the individual who connects the different domains of personal identity.  Without you and your needs to trade, barter, work, play - these domains would not need connecting.  We create and need connections and are the connector.  We may be lucky enough to have a foundational ID at birth, but for a digital life, we need both.  We are the connection until death when they will slowly separate again as emergent ID is no longer needed, but a link can continue.   

Mapping some solutions

Below is the polarity map of identity with SSI, SOLID, yoti, and Vouch added. Why have I only picked these from the 100+ solutions? Because I want others to place their own solutions on the map. The placements are the starting positions for onboarding, the value proposition to get onto the identity loop. It is not where they stay or add the most value; it is day zero, what problem is solved and why you should join. I am likely to have incorrectly positioned them, so if you feel they are wrong, please identify a better place. The message is that we all start at different places to solve the same problem. But what we do is continually confuse everyone and ensure that we disagree on what identity means or what it is for. We hold fast to our “best in class” solution, our product and our personal beliefs with confirmation bias.

Who has solved the same problem before?

An insight for us (the identity industry) comes from an unlikely source. Time to look at the green movement, global warming and climate change. Back in the 1960s, when protests against nuclear power and arms were in full swing, we also marched for: save the whales, save the endangered species, save the newt, save the planet, save humanity, save the tigers, save the elephants, save the orangutans and many more. On a march or protest, there were a few to a hundred people, and occasionally a thousand. Co-ordinated marches across the globe in different cities swelled the numbers on a specific topic. During the ’80s, “Feed the World” raised millions of pounds for a cause: famine in Africa. When acting as one, we can start to create change.

At some point, something interesting happened. Individuals connected their ecosystem crisis to another, more significant, crisis. It was bigger than a connection; it was a cause. The small marches against an effect could start to see there was a more significant cause. Global warming and climate change brought together those focusing on local protests about loss of habitat to march together as one voice, because their loss resulted from a shared cause. Save the Whales would not join a save-the-newt march but can walk side by side on climate change. Climate change was not a march but a “co-opted” group of thousands of different factions fighting for their own passion, purpose or love. Many voices became one and retained their own passion.

Identity has many factions, many voices, many causes, many definitions and many solutions. We continue to define, redefine and demand that our “xxx” of identity is the one, usually because it is the model we want and the one from which we will benefit.

Our big and unifying issue is to recognise that corporates, especially big tech and government, have no interest in one identity, and indeed would find ways to ensure it remains factional and uncoordinated unless they are the single controller. They will sponsor and support many programs and many solutions and keep the divisions going, playing a game. A single identity would likely also render Facebook and media companies, with their advertising model, unworkable - so not all big tech wants a solution! Division works.

Perhaps the lesson here is that we should stop fighting for our own version of identity (privacy, ethics, consent), for our own definitions, for our own implementations, for our own solutions, for our own products and start to think about acting like one. Each of us swallows some pride, eats some ego and finds ways to behave as one industry voice, one team who cares together as we have passion, purpose and heart to solve identity problems for everyone.

The railway, automobile and aero industries across the world have made it work: building different-gauge infrastructures to different standards, but eventually learning that they thrive together once they stop demanding aligned infrastructures and instead find ways to ensure users can access and use it all.

If you disagree, share this and say why. If you agree, share this, so we talk about it at our meeting places.  If you're an influencer, reflect to see if we are the problem. If you organise events, why not make time to lead this debate?  


----

This article has been in formation for some 18 months. It had no direction or lacked sense at points, and through perseverance, friends, and discussion, it has become what it is today.  Many have shaped the thinking.









Tuesday, 03. August 2021

Phil Windley's Technometria

Smart Property

Summary: Smart property is much more than the anemic connected things we have now. Smart property imagines a world where every thing participates in digital communities and ecosystems, working through programmable agents under the owners control. I just listened to this excellent podcast by Vinay Gupta about what he calls "smart property." Vinay released a book last year call The Fu

Summary: Smart property is much more than the anemic connected things we have now. Smart property imagines a world where every thing participates in digital communities and ecosystems, working through programmable agents under the owner's control.

I just listened to this excellent podcast by Vinay Gupta about what he calls "smart property." Vinay released a book last year called The Future of Stuff that covers this topic in more detail.

The Future of Stuff by Vinay Gupta

Where and who do we want to be? How might we get there? What might happen if we stay on our current course? The Future of Stuff asks what kind of world will we live in when every item of property has a digital trace, when nothing can be lost and everything has a story. Will property and ownership become as fluid as film is today: summoned on demand, dismissed with a swipe? What will this mean for how we buy, rent, share and dispose of stuff? About what our stuff says about us? And how will this impact on us, on manufacturing and supply, and on the planet?

The idea is similar to what Bruce Sterling has called Spimes and what we've been building on top of picos for over a decade.

Smart property goes well beyond the internet of things and connected devices to imagine a world where every thing is not just online, but has a digital history and can interact with other smart things to accomplish whatever goals their owners desire. Things are members of communities and ecosystems, working through programmable agents.

A world of smart things is decentralized–it has to be. While Vinay talks of blockchains and smart contracts, I work on picos. Likely both, or some version of them, are necessary to achieve the end goal of a sustainable internet of things to replace the anemic, unsustainable CompuServe of Things we're being offered now.

Going Further

We've built a platform on top of picos for managing things called Manifold. This is a successor to SquareTag, if you've been following along. You can use Manifold to create spimes or digital twins for your things. Some of the built-in applications allow you to find lost stuff using QR code stickers or write notes about your things using a journaling app. You could build others since the application is meant to be programmable and extendable. I primarily use it as a personal inventory system.

Photo Credit: Spimes Not Things. Creating A Design Manifesto For A Sustainable Internet of Things from Michael Stead, Paul Coulton & Joseph Lindley (Fair Use)

Tags: ssiot iot picos


MyDigitalFootprint

CDO Day 0 - The many faces of evidence

Click here to access all the articles and archive for </Hello CDO> A very modern problem for leadership, executives and boards is, in fact, a very old problem; it is just that the scale, impact and consequences have increased.  Evidence.   There are two sides to the evidence coin, the proof the problem is the priority problem for scarce resources, and the other side is th
Click here to access all the articles and archive for </Hello CDO>

A very modern problem for leadership, executives and boards is, in fact, a very old problem; it is just that the scale, impact and consequences have increased.  Evidence.  

There are two sides to the evidence coin: one is proof that the problem is the priority problem for scarce resources, and the other is proof that the proposed solution is the one that will drive the outcomes we desire. These are very different evidence requirements, and each side has many faces.

Complexity and uncertainty mean that the historical preference for gut feeling and big leadership power plays provide insufficient rigour in the decision-making process. However, evidence gathering and presentation have similar issues.   Evidence can always be found to support the outcome you want (is this a problem or not), and incentives mean that evidence about which solution creates the best outcome will be lost, reframed and distorted. 

Unpicking these two sides and many faces of evidence is a Day 0 challenge for all CDOs.

In previous articles, I have written about many aspects of this problem.

Do I start to unpack whether this is a problem and challenge it?

Do I unpack this recommended solution, given that the evidence is biased?

Do I want to be the new corporate punch bag?

Do I leave the past to be the past and only focus on the new, knowing the past will influence the future?

How do I raise the evidence bar knowing that facts and science will not help bring the team together?

Do I want to add a new CORE responsibility to an already packed list?

Are we sure what we are optimising for?

Have I got the time to act as the translator and interrupter?

The core of evidence is knowing.

How to know that what you know is true is epistemology. Epistemology doesn’t just ask questions about what we should do to find things out; that is the task of all disciplines to some extent. For example, science, research, data analysis, and anthropology all have their own methods for finding things out. Epistemology has the job of making those methods themselves the objects of study. It aims to understand how methods of inquiry can be seen as rational endeavours.

If the concept of “knowing things to be true” were simple, we’d all agree on many things that we currently disagree about; climate and vaccines are current examples. That we cannot reach unanimous agreement means something is wrong with our data, analysis, and information model.

The senior leadership team will tend to think of themselves as clear thinkers and see those who disagree with an opinion as misguided. Executives believe they have the capacity to see things just as they “really” are and that it is others who have confused perceptions. As a result, the temptation is for the CDO simply to point out to everyone where their thinking has gone wrong using facts, rather than engage in rational dialogue that allows for the possibility that we all might be wrong. Philosophy, psychology and cognitive science teach us that we are complex, and our reasoning is not clinical or pure. Data (facts) and their analysis create a staggeringly complex array of cognitive biases and dispositions. We are generally ignorant of the role these play both in others who influence us and in our own thinking and decision-making.

A CDO is responsible for introducing everyone who touches data to a systematic way of interrogating thinking, models, rationality, and a sense of what makes “for a good reason” for both sides of the evidence coin. 

As a CDO, you should help determine by what criteria the leadership team is to evaluate reason. We need to establish how, as a team, the criteria themselves are evaluated. Further, how do we determine whether a belief or action can be justified, and what is the relationship between justification and truth? And as the CDO, we have to establish and own the company-wide top-level data ontology.


The first action will be to get the topic of evidence on the agenda, not as a diversion, but to start to bring up critical issues such as bias in our decision making. 



----

Whilst our ongoing agile iteration into information beings is never-ending, there are the first 100 days in the new role. But what to focus on? Well, that rose-tinted period of conflicting priorities is what </Hello, CDO!> is all about. Maintaining sanity when all else has been lost to untested data assumptions is a different problem entirely.


Monday, 02. August 2021

Doc Searls Weblog

Car design trends

On Quora, here’s my answer to What are the worst design trends in modern cars?—updated by our family’s experience with a new Toyota that features even more indicators than the bunch above:::: Based on driving lots of late-model rental cars, here’s a list: Entertainment systems that are hard to use and dangerous on the road. (Few are […]

On Quora, here’s my answer to What are the worst design trends in modern cars?—updated by our family’s experience with a new Toyota that features even more indicators than the bunch above:

Based on driving lots of late-model rental cars, here’s a list:

Entertainment systems that are hard to use and dangerous on the road. (Few are good. Most are bad. Some are truly awful.)

Making AM and FM listening harder than ever. Some of this is by putting too many functions in too many menus you have to poke at. (While driving, knobs and switches beat buttons for usability. Ask a pilot.) Some of it is by burying antennas in windows, which will never work as well as a whip antenna (preferably the retractable kind that can survive a car wash). But a thumbs-up to cars offering HD Radio, which adds many more stations to FM and far better sound to AM (on the sadly too few stations that feature it).

Way too much optionality among features with non-obvious meanings that you control through buttons with whaaa? symbols or half-buried menus that can be as dangerous to navigate while driving as it is to finger-text on one’s cell phone. For example, a loaded 2021 Toyota Camry Hybrid has TSS w/PD, HUD, DRCC, LDA w/SA, LTA, AHB, PCS, RCD, RSA, BSM, RCTA, RCTB, VSC, EV, ECO, plus other stuff that’s a bit more spelled out, such as TRAC, Qi Wireless Charger and Birds Eye View Camera. And those are in addition to the usual indicators: shifter position, odometer, outside temperature, etc. Many of these unclear functions are displayed only or mainly in the “Meters/Multi-Information Display” you view through or over your steering wheel. Since there is no way the display can give you a full view of all these functions in all their possible states, you move around your selections and menus through buttons on the steering wheel that you mash with your left thumb. And that’s just one model of one car. (Which we happen to almost have: ours is a 2020 model.)

Poor visibility out the back corners, thanks to extra-wide roof pillars and fake-muscle styling that narrows the shapes of the cars’ aft windows.

No place to mount a phone. I mean, why have Apple’s CarPlay and/or Android Auto and not have a place to mount a phone? (Yes, there are aftermarket things with suction cups, but most new cars lack a surface other than the windshield that will hold a cup sucked.)

Trunks with plenty of space but too small an opening, so it’s hard to get large or odd-shaped items in there.

Low-profile and performance tires, which handle nicely but can ride rough and transmit lots of road noise.

Too much black. On the dashboard platform under windshields, black makes sense because you don’t want a light color reflecting off the windshield. But black is used way too much in trim. Worse, black steering wheels parked in the sun can get too hot to hold. And black leather or vinyl seats can fry your ass.

Giant grills—especially ones that resemble the mouths of manta rays. (I’m looking at you, Lexus.)

The tendency of headlight lenses to develop cataracts. My ’05 Subaru has them. My daughter’s newer Honda Civic has worse ones. Could be newer models don’t do that, but it’s actually dangerous and needs to be gone.

Comments still don’t work here, so instead tweet about it or write me directly: first name at last name dot com.


Just a Theory

Sign O’ The Times

I started a new gig last week, after ten rewarding years at the old job. Pretty stoked.

Some news: I’m super happy to report that I started a new job last week at The New York Times.

After ten years at iovation, the last four working remotely from New York City and the last three under the ownership of TransUnion, I felt it was time to find something new. At The Times, I’ve taken the role of Staff Engineer on a new team, User Systems. I’m particularly stoked for this gig, as it falls right into areas of abiding interest, including privacy-by-design, personal data protection, encryption, authentication, credential management, and scaling a vital app for the whole business. Add that to the straightforward commute once the office reopens, and it’s hard to find something more ideal.

I truly appreciate the extraordinary experience of my ten years at iovation. I originally thought I’d stay a couple years, but was so engaged by the people and the great work we did that I kept at it. I learned a ton about product engineering, product design, and scalable architectures, but especially about working with terrific colleagues who made me a better person even as I tried to be of service to them. I will especially miss working with Scott, Kurk, Clara, Travis, John, and Eric — and countless others. I wish them all the best, and would enjoy working with any and all of them again anytime.

Now I’m excited to make new connections working with my amazing new colleagues at The Times. I expect we’ll collaborate on fulfilling work building super useful tools that advance The Times mission to inform and empower its readers. I’m delighted to be jumping on this ride with them.

More about… Personal Work New York Times

Damien Bod

Securing an Angular app which uses multiple identity providers

Sometimes Angular applications are required to authenticate against multiple identity providers. This blog post shows how to implement an Angular SPA which authenticates using Auth0 for one identity provider and also IdentityServer4 from Duende software as the second. The SPA can logout from both of the identity providers individually and also revoke the refresh token […]

Sometimes Angular applications are required to authenticate against multiple identity providers. This blog post shows how to implement an Angular SPA which authenticates using Auth0 for one identity provider and also IdentityServer4 from Duende software as the second. The SPA can logout from both of the identity providers individually and also revoke the refresh token used to renew the session using the revocation endpoint. The endsession endpoint is used to logout. IdentityServer4 also supports introspection so that it is possible to revoke reference tokens on a logout as well as the refresh token. The Angular application uses OpenID Connect code flow with PKCE implemented using the npm angular-auth-oidc-client library.

Code: https://github.com/damienbod/angular-auth-oidc-client/tree/main/projects/sample-code-flow-multi-Auth0-ID4

The angular-auth-oidc-client npm package can be added to the application in the AppModule class. We use an AuthConfigModule to configure the settings, which keeps the AppModule small.

import { NgModule } from '@angular/core';
import { BrowserModule } from '@angular/platform-browser';
import { RouterModule } from '@angular/router';
import { AppComponent } from './app.component';
import { AuthConfigModule } from './auth-config.module';
import { HomeComponent } from './home/home.component';
import { UnauthorizedComponent } from './unauthorized/unauthorized.component';

@NgModule({
  declarations: [AppComponent, HomeComponent, UnauthorizedComponent],
  imports: [
    BrowserModule,
    RouterModule.forRoot([
      { path: '', redirectTo: 'home', pathMatch: 'full' },
      { path: 'home', component: HomeComponent },
      { path: 'forbidden', component: UnauthorizedComponent },
      { path: 'unauthorized', component: UnauthorizedComponent },
    ]),
    AuthConfigModule,
  ],
  providers: [],
  bootstrap: [AppComponent],
})
export class AppModule {}

The AuthConfigModule configures the two identity providers. Both clients are set up to use the OpenID Connect code flow with PKCE (proof key for code exchange). Each secure token server (STS) has its own specific configuration which is required to use this flow. These configurations must match the corresponding client configuration on the token service of the identity provider. All identity providers have their own flavour of the OpenID Connect specification and also support different features; some of these are worse, some are better. When implementing SPA applications, you should validate that the required features are supported by the identity provider.

import { NgModule } from '@angular/core';
import { AuthModule, LogLevel } from 'angular-auth-oidc-client';

@NgModule({
  imports: [
    AuthModule.forRoot({
      config: [
        {
          authority: 'https://offeringsolutions-sts.azurewebsites.net',
          redirectUrl: window.location.origin,
          postLogoutRedirectUri: window.location.origin,
          clientId: 'angularCodeRefreshTokens',
          scope: 'openid profile email taler_api offline_access',
          responseType: 'code',
          silentRenew: true,
          useRefreshToken: true,
          logLevel: LogLevel.Debug,
        },
        {
          authority: 'https://dev-damienbod.eu.auth0.com',
          redirectUrl: window.location.origin,
          postLogoutRedirectUri: window.location.origin,
          clientId: 'Ujh5oSBAFr1BuilgkZPcMWEgnuREgrwU',
          scope: 'openid profile offline_access auth0-user-api-spa',
          responseType: 'code',
          silentRenew: true,
          useRefreshToken: true,
          logLevel: LogLevel.Debug,
          customParamsAuthRequest: {
            audience: 'https://auth0-api-spa',
          },
          customParamsRefreshTokenRequest: {
            scope: 'openid profile offline_access auth0-user-api-spa',
          },
        },
      ],
    }),
  ],
  exports: [AuthModule],
})
export class AuthConfigModule {}

The checkAuthMultiple function is used to initialize the authentication state and begin the authentication flows. An SPA requires some route where this logic can run before the guards execute; this is not a server-rendered application where the security runs in the trusted backend. The initialization logic sets up the state and handles the callbacks from the different authentication flows. Great care should be taken when using guards: if a guard is applied to the default route, you need to ensure that the initialization logic runs first (a minimal guard sketch follows the component below). The checkAuthMultiple function should only be called once in the application.

import { Component, OnInit } from '@angular/core';
import { OidcSecurityService } from 'angular-auth-oidc-client';

@Component({
  selector: 'app-root',
  templateUrl: 'app.component.html',
})
export class AppComponent implements OnInit {
  constructor(public oidcSecurityService: OidcSecurityService) {}

  ngOnInit() {
    this.oidcSecurityService
      .checkAuthMultiple()
      .subscribe(([{ isAuthenticated, userData, accessToken }]) => {
        console.log('Authenticated', isAuthenticated);
      });
  }
}
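As a rough sketch only (this is not part of the original sample; the AuthGuard name and the redirect to the unauthorized route are my assumptions), a guard that reads the library's isAuthenticated$ observable might look like this:

import { Injectable } from '@angular/core';
import { CanActivate, Router, UrlTree } from '@angular/router';
import { OidcSecurityService } from 'angular-auth-oidc-client';
import { Observable } from 'rxjs';
import { map, take } from 'rxjs/operators';

// Hypothetical guard: only meaningful on routes reached after checkAuthMultiple()
// has run in AppComponent, so keep it off the default route that handles the
// sign-in callbacks.
@Injectable({ providedIn: 'root' })
export class AuthGuard implements CanActivate {
  constructor(private oidcSecurityService: OidcSecurityService, private router: Router) {}

  canActivate(): Observable<boolean | UrlTree> {
    return this.oidcSecurityService.isAuthenticated$.pipe(
      take(1),
      map(({ isAuthenticated }) => (isAuthenticated ? true : this.router.parseUrl('/unauthorized')))
    );
  }
}

Such a guard would then be attached with canActivate: [AuthGuard] to routes other than the default one.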

The sign-in, logout and token revocation can then be implemented anywhere for the different identity providers. Each client configuration has its own configuration ID which can be used to start the required authentication flow or the logout.

import { Component, OnInit } from '@angular/core';
import {
  AuthenticatedResult,
  OidcClientNotification,
  OidcSecurityService,
  OpenIdConfiguration,
  UserDataResult,
} from 'angular-auth-oidc-client';
import { Observable } from 'rxjs';

@Component({
  selector: 'app-home',
  templateUrl: 'home.component.html',
})
export class HomeComponent implements OnInit {
  configurations: OpenIdConfiguration[];
  userDataChanged$: Observable<OidcClientNotification<any>>;
  userData$: Observable<UserDataResult>;
  isAuthenticated$: Observable<AuthenticatedResult>;

  constructor(public oidcSecurityService: OidcSecurityService) {}

  ngOnInit() {
    this.configurations = this.oidcSecurityService.getConfigurations();
    this.userData$ = this.oidcSecurityService.userData$;
    this.isAuthenticated$ = this.oidcSecurityService.isAuthenticated$;
  }

  login(configId: string) {
    this.oidcSecurityService.authorize(configId);
  }

  forceRefreshSession() {
    this.oidcSecurityService.forceRefreshSession().subscribe((result) => console.warn(result));
  }

  logout(configId: string) {
    this.oidcSecurityService.logoff(configId);
  }

  refreshSessionId4(configId: string) {
    this.oidcSecurityService.forceRefreshSession(null, configId).subscribe((result) => console.log(result));
  }

  refreshSessionAuth0(configId: string) {
    this.oidcSecurityService
      .forceRefreshSession({ scope: 'openid profile offline_access auth0-user-api-spa' }, configId)
      .subscribe((result) => console.log(result));
  }

  logoffAndRevokeTokens(configId: string) {
    this.oidcSecurityService.logoffAndRevokeTokens(configId).subscribe((result) => console.log(result));
  }

  revokeRefreshToken(configId: string) {
    this.oidcSecurityService.revokeRefreshToken(null, configId).subscribe((result) => console.log(result));
  }
}

The application can be run and you can log out or use the required configuration. Underneath, the identity has not signed in with IdentityServer4 while the Auth0 client has been authenticated. Each identity provider is independent of the other.

The multiple identity provider support is implemented using redirects. You could also implement all of this using popups, depending on your use cases.

Links:

https://github.com/damienbod/angular-auth-oidc-client

https://auth0.com/

https://github.com/IdentityServer/IdentityServer4

https://duendesoftware.com/

Sunday, 01. August 2021

Jon Udell

A beautiful power tool to scrape, clean, and combine data

Labels like “data scientist” and “data journalist” connote an elite corps of professionals who can analyze data and use it to reason about the world. There are elite practitioners, of course, but since the advent of online data a quarter century ago I’ve hoped that every thinking citizen of the world (and of the web) … Continue reading A beautiful power tool to scrape, clean, and combine data

Labels like “data scientist” and “data journalist” connote an elite corps of professionals who can analyze data and use it to reason about the world. There are elite practitioners, of course, but since the advent of online data a quarter century ago I’ve hoped that every thinking citizen of the world (and of the web) could engage in similar analysis and reasoning.

That’s long been possible for those of us with the ability to wrangle APIs and transform data using SQL, Python, or another programming language. But even for us it hasn’t been easy. When I read news stories that relate to the generation of electric power in California, for example, questions occur to me that I know I could illuminate by finding, transforming, and charting sources of web data:

– How can I visualize the impact of shutting down California’s last nuclear plant?

– What’s the relationship between drought and hydro power?

All the ingredients are lying around in plain sight, but the effort required to combine them winds up being more trouble than it’s worth. And that’s for me, a skilled longtime scraper and transformer of web data. For you — even if you’re a scientist or a journalist! — that may not even be an option.

Enter Workbench, a web app with the tagline: “Scrape, clean, combine and analyze data without code.” I’ve worked with tools in the past that pointed the way toward that vision. DabbleDB in 2005 (now gone) and Freebase Gridworks in 2010 (still alive as Open Refine) were effective ways to cut through data friction. Workbench carries those ideas forward delightfully. It enables me to fly through the boring and difficult stuff — the scraping, cleaning, and combining — in order to focus on what matters: the analysis.

Here’s the report that I made to address the questions I posed above. It’s based on a workflow that you can visit and explore as I describe it here. (If you create your own account you can clone and modify.)

The workflow contains a set of tabs; each tab contains a sequence of steps; each step transforms a data set and displays output as a table or chart. When you load the page the first tab runs, and the result of its last step is displayed. In this case that’s the first chart shown in the report:

As in a Jupyter notebook you can run each step individually. Try clicking step 1. You’ll see a table of data from energy.ca.gov. Notice that step 1 is labeled Concatenate tabs. If you unfurl it you’ll see that it uses another tab, 2001-2020 scraped, which in turn concatenates two other tabs, 2001-2010 scraped and 2011-2020 scraped. Note that I’ve helpfully explained that in the optional comment field above step 1.

Each of the two source tabs scrapes a table from the page at energy.ca.gov. As I note in the report, it wasn’t necessary to scrape those tables since the data are available as an Excel file that can be downloaded, then uploaded to Workbench (as I’ve done in the tab named energy.ca.gov xslx). I scraped them anyway because that web page presents a common challenge: the data appear in two separate HTML tables. That’s helpful to the reader but frustrating to an analyst who wants to use the data. Rapid and fluid combination of scraped tables is grease for cutting through data friction; Workbench supplies that grease.

Now click step 2 in the first tab. It’s the last step, so you’re back to the opening display of the chart. Unfurl it and you’ll see the subset of columns included in the chart. I’ve removed some minor sources, like oil and waste heat, in order to focus on major ones. Several details are notable here. First: colors. The system provides a default palette but you can adjust it. Black wasn’t on the default palette but I chose that for coal.

Second, grand total. The data set doesn’t include that column, and it’s not something I needed here. But in some situations I’d want it, so the system offers it as a choice. That’s an example of the attention to detail that pervades every aspect of Workbench.

Third, Vega. See the triple-dot button above the legend in the chart? Click it, then select Open in Vega Editor, and when you get there, click Run. Today I learned that Vega is:

a declarative format for creating, saving, and sharing visualization designs. With Vega, visualizations are described in JSON, and generate interactive views using either HTML5 Canvas or SVG.

Sweet! I think I’ll use it in my own work to simplify what I’ve recently (and painfully) learned how to do with D3.js. It’s also a nice example of how Workbench prioritizes openness, reusability, and reproducibility in every imaginable way.

I use the chart as the intro to my report, which is made with an elegant block editor in which you can combine tables and charts from any of your tabs with snippets of text written in markdown. There I begin to ask myself questions, adding tabs to marshal supporting evidence and sourcing evidence from tabs into the report.

My first question is about the contribution that the Diablo Canyon nuclear plant has been making to the overall mix. In the 2020 percentages all major sources tab I start in step 1 by reusing the tab 2001-2020 scraped. Step 2 filters the columns to just the same set of major sources shown in the chart. I could instead apply that step in 2001-2020 scraped and avoid the need to select columns for the chart. Since I’m not sure how that decision might affect downstream analysis I keep all the columns. If I change my mind it’s easy to push the column selection upstream.

Workbench not only makes it possible to refactor a workflow, it practically begs you to do that. When things go awry, as they inevitably will, it’s no problem. You can undo and redo the steps in each tab! You won’t see that in the read-only view but if you create your own account, and duplicate my workflow in it, give it a try. With stepwise undo/redo, exploratory analysis becomes a safe and stress-free activity.

At step 2 of 2020 percentages all major sources we have rows for all the years. In thinking about Diablo Canyon’s contribution I want to focus on a single reference year so in step 3 I apply a filter that selects just the 2020 row. Here’s the UX for that.

In situations like this, where you need to select one or more items from a list, Workbench does all the right things to minimize tedium: search if needed, start from all or none depending on which will be easier, then keep or remove selected items, again depending on which will be easier.

In step 4 I include an alternate way to select just the 2020 row. It’s a Select SQL step that says select * from input where Year = '2020'. That doesn’t change anything here; I could omit either step 3 or step 4 without affecting the outcome; I include step 4 just to show that SQL is available at any point to transform the output of a prior step.

Which is fantastic, but wait, there’s more. In step 5 I use a Python step to do the same thing in terms of a pandas dataframe. Again this doesn’t affect the outcome, I’m just showing that Python is available at any point to transform the output of a prior step. Providing equivalent methods for novices and experts, in a common interface, is extraordinarily powerful.

I’m noticing now, by the way, that step 5 doesn’t work if you’re not logged in. So I’ll show it to you here:
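The screenshot doesn't carry over into this text version, so here is a rough reconstruction of the pandas code behind that step (the wrapper function and the exact column name are my guesses, not taken from the workflow):

import pandas as pd

# Hypothetical reconstruction of step 5: keep only the 2020 row,
# mirroring the SQL step `select * from input where Year = '2020'`.
def keep_2020(table: pd.DataFrame) -> pd.DataFrame:
    return table[table['Year'] == '2020']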

Step 6 transposes the table so we can reason about the fuel types. In steps 3-5 they’re columns, in step 6 they become rows. This is a commonly-needed maneuver. And while I might use the SQL in step 4 to do the row selection handled by the widget in step 3, I won’t easily accomplish the transposition that way. The Transpose step is one of the most powerful tools in the kit.

Notice at step 6 that the first column is named Year. That’s a common outcome of transposition and here necessitates step 7 in which I rename it to Fuel Type. There are two ways to do that. You can click the + Add Step button, choose the Rename columns option, drag the new step into position 7, open it, and do the renaming there.

But look:

You can edit anything in a displayed table. When I change Year to Fuel Type that way, the same step 7 that you can create manually appears automatically.

It’s absolutely brilliant.

In step 8 I use the Calculate step to add a new column showing each row’s percentage of the column sum. In SQL I’d have to think about that a bit. Here, as is true for so many routine operations like this, Workbench offers the solution directly:
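The Calculate step's screenshot isn't reproduced here; for comparison, the same "each row as a share of the column total" calculation expressed as a SQL window function might look roughly like this (the column names are assumed; input is the prior step's table, as in the earlier SQL step):

select
  "Fuel Type",
  "2020",
  100.0 * "2020" / sum("2020") over () as percent
from input;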

Finally in step 9 I sort the table. The report includes it, and there I consider the question of Diablo Canyon’s contribution. According to my analysis nuclear power was 9% of the major sources I’ve selected, contributing 16,280 GWh in 2020. According to another energy.ca.gov page that I cite in the report, Diablo Canyon is the only remaining nuke plant in the state, producing “about 18,000 GWh.” That’s not an exact match but it’s close enough to give me confidence that reasoning about the nuclear row in the table applies to Diablo Canyon specifically.

Next I want to compare nuclear power to just the subset of sources that are renewable. That happens in the 2020 percentages renewable tab, the output of which is also included in the report. Step 1 begins with the output of 2020 percentages of all major sources. In step 2 I clarify that the 2020 column is really 2020 GWh. In step 3 I remove the percent column in order to recalculate it. In step 4 I remove rows in order to focus on just nuclear and renewables. In step 5 I recalculate the percentages. And in step 6 I make the chart that also flows through to the report.

Now, as I look at the chart, I notice that the line for large hydro is highly variable and appears to correlate with drought years. In order to explore that correlation I look for data on reservoir levels and arrive at https://cdec.water.ca.gov/. I’d love to find a table that aggregates levels for all reservoirs statewide since 2001, but that doesn’t seem to be on offer. So I decide to use Lake Mendocino as a proxy. In step 1 I scrape an HTML table with monthly levels for the lake since 2001. In step 2 I delete the first row which only has some months. In step 3 I rename the first column to Year in order to match what’s in the table I want to join with. In step 4 I convert the types of the month columns from text to numeric to enable calculation. In step 5 I calculate the average into a new column, Avg. In step 6 I select just Year and Avg.

When I first try the join in step 8 it fails for a common reason that Workbench helpfully explains:

In the other table Year looks like ‘2001’, but the text scraped from energy.ca.gov looks like ‘2,001’. That’s a common glitch that can bring an exercise like this to a screeching halt. There’s probably a Workbench way to do this, but in step 7 I use SQL to reformat the values in the Year column, removing the commas to enable the join (a sketch of that step follows below). While there I also rename the Avg column to Lake Mendocino Avg Level. Now in step 8 I can do the join.
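A guess at the shape of that step 7 SQL (the column names follow the description above; the actual code in the workflow may differ):

select
  replace("Year", ',', '') as "Year",
  "Avg" as "Lake Mendocino Avg Level"
from input;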

In step 9 I scale the values for Large Hydro into a new column, Scaled Large Hydro. Why? The chart I want to see will compare power generation in GWh (gigawatt hours) and lake levels in AF (acre-feet). These aren’t remotely compatible but I don’t care, I’m just looking for comparable trends. Doubling the value for Large Hydro gets close enough for the comparison chart in step 10, which also flows through to the report.

All this adds up to an astonishingly broad, deep, and rich set of features. And I haven’t even talked about the Clean text step for tweaking whitespace, capitalization, and punctuation, or the Refine step for finding and merging clusters of similar values that refer to the same things. Workbench is also simply beautiful as you can see from the screen shots here, or by visiting my workflow. When I reviewed software products for BYTE and InfoWorld it was rare to encounter one that impressed me so thoroughly.

But wait, there’s more.

At the core of my workflow there’s a set of tabs; each is a sequence of steps; some of these produce tables and charts. Wrapped around the workflow there’s the report into which I cherrypick tables and charts for the story I’m telling there.

There’s also another kind of wrapper: a lesson wrapped around the workflow. I could write a lesson that guides you through the steps I’ve described and checks that each yields the expected result. See Intro to data journalism for an example you can try. Again, it’s brilliantly well done.

So Workbench succeeds in three major ways. If you’re not a data-oriented professional, but you’re a thinking person who wants to use available web data to reason about the world, Workbench will help you power through the grunt work of scraping, cleaning, and combining that data so you can focus on analysis. If you aspire to become such a professional, and you don’t have a clue about how to do that grunt work, it will help you learn the ropes. And if you are one of the pros you’ll still find it incredibly useful.

Kudos to the Workbench team, and especially to core developers Jonathan Stray, Pierre Conti, and Adam Hooper, for making this spectacularly great software tool.

Wednesday, 28. July 2021

Margo Johnson

An Ode to Generous Networks

Reflecting on change and community in a time of personal transition. Staring out over a world of possibilities (Joshua Tree, July 2021) Navigating through this global pandemic has leveled up both my tolerance for ambiguity and my appreciation for community — including friends, family, co-workers, and global colleagues. These dual forces of change and connection are also guiding me forward as
Reflecting on change and community in a time of personal transition. Staring out over a world of possibilities (Joshua Tree, July 2021)

Navigating through this global pandemic has leveled up both my tolerance for ambiguity and my appreciation for community — including friends, family, co-workers, and global colleagues. These dual forces of change and connection are also guiding me forward as I embark on a new professional chapter.

After three deeply rewarding and challenging years with Transmute, I’m transitioning into a new role this fall.

I will continue to work in the decentralized identity and verifiable credentials ecosystem but will return to more of my impact roots as Senior Director of Product with Kiva’s Protocol team — part of an international non-profit working on digital identity as an enabler of deeper financial inclusion. I will also remain a champion and alumni advisor of Transmute, cheering the team on through next iterations of company and product development.

Making any big transition like this always surfaces the forest through the trees for me, bringing some longer term trends into focus in a way I often miss when heads down on day to day work.

In particular I’ve found myself reflecting on the philosophy and practice of open versus closed networks, and my growing conviction that open networks are a key to satisfaction and success, both for individuals and for businesses.

My working definition of “open networks” boils down to a mindset and style of engagement that assumes generosity, collaboration, and expanded connections will lead to mutual benefit. This can look like contributing to global standards and open source work, being proactive with connecting people, offering support without expectation of returned favors, and celebrating colleagues as they transition organizations (something I’ve been fortunate to experience this month with Transmute).

Open network interactions are empathetic and relational rather than transactional, rooted in a belief that we can progress and grow together, creating more opportunity in the process rather than competing for something finite.

By contrast, a more closed network approach (all too often the default) sees the gain of others as a personal loss if not compensated in some way. It is a more protective mentality that discourages transparency and informal exchange. It looks like vendor lock-in, punitive non-compete agreements, and proprietary code. To be clear, some degree of this is necessary for businesses to function today — to show value for investors, protect IP, drive competitive advantage, etc. However, taken as a default state, closed networks can crush both innovation and the spirits of individuals whose interests expand beyond the bounds of any single identity or organization.

One of the things that drew me to Transmute — and more broadly to emergent technology like decentralized identities and verifiable credentials — is that the open network style is often the default culture of engagement. In fact, this approach is necessary in the face of pre-competitive table stakes like technical interoperability as well as collaborative engagement and advocacy in the broader regulatory environment. Essentially, the adoption of this technology is far bigger than any one of us, and we have to work together to build the market and adoption we want to see in the world.

This open, generous network ethos aligns with the future I want to be part of, both with the organizations I help build and personally as a self-actualizing and social human. Part of the “crafted self” we talk about as identity workers includes the ability to be selectively multi-dimensional beings across different settings. To me this includes the freedom to traverse organizations and even industries as our interests take us, and to carry our reputation and relationships along that journey. Viewed in this way, the interoperability and autonomy values woven into the foundations of decentralized tech become further tools that can help bring to bear an even more generous world.

A huge thank you to the Transmute Team — particularly Karyl Fowler and Orie Steele — for leading the way, both as technologists and founders, in making the bet that movement across systems and organizations is a win for all parties involved (be they verifiable trade documents or prior employees). It has been a pleasure to work for and with you over the past few years.

***

If you are interested in some of the cross-network work happening in the decentralized tech space you can also check out the Decentralized Identity Foundation working groups, the Internet Identity Workshop, the Trust Over IP Foundation, and the Department of Homeland Security Silicon Valley Innovation Program (to name just a few!).

Jumping off, jumping in! (Mammoth Lakes CA, July 2021)

Tuesday, 27. July 2021

MyDigitalFootprint

Being Curious will not kill the #CDO

The last article was about being railroaded in the first 100 days, recognising that you are forced into a decision.  In this one I wanted to unpack that “data” shows that using science in an argument (say to defend against railroading) just makes the members of the team more partisan (aligned to their own confirmation bias and opinions).  As the #CDO, your job is to use data and science
The last article was about being railroaded in the first 100 days, recognising that you are forced into a decision. In this one I want to unpack the finding that “data” shows that using science in an argument (say, to defend against railroading) just makes the members of the team more partisan (aligned to their own confirmation bias and opinions). As the #CDO, your job is to use data and science; therefore, in the first 100 days, with this insight, you are more likely to lose people than win friends, lose arguments rather than win them, and create bigger hurdles. What I suggest, based on this work, is that the way to overcome “proof by science” is to use curiosity to bring us together.

Image source: from a good article by Douglas Longenecker

---

Dan Kahan, a Yale behavioural economist, has spent the last decade studying whether the use of reason aggravates or reduces “partisan” beliefs. His research papers are here. His research shows that aggravation and alienation easily win, irrespective of being more liberal or conservative. The more we use our faculties for scientific thought, the more likely we are to take a strong position that aligns with our (original) political group (or thought). 

A way through this is to copy “solutions journalism”, which reports on the ways that people and governments meaningfully respond to difficult problems, not on what the data says the problem is. Rather than use our best insights, analysis and thinking to reach a version of the “truth”, we use data to find ways to agree with others’ opinions in our communities. We help everyone to become curious. I use the Peak Paradox framework as an approach to remaining curious as to where we are aligned and where there is a delta.

What happens when we use data and science in our arguments and explain what the problem is, is that the individuals will selectively credit and discredit information in patterns that reflect their commitment to certain values. They (we) (I) assimilate what they (we)(I) want. 

Kahan, in 2014, asked over 1,500 respondents whether they agreed or disagreed with the following statement: “There is solid evidence of recent global warming due mostly to human activity such as burning fossil fuels.” They collected information on individuals’ political beliefs and rated their science intelligence. The analysis found that those with the least science intelligence actually have less partisan positions than those with the most. A Conservative with strong science intelligence will use their skills to find evidence against human-caused global warming, while a Liberal will find evidence for it (cognitive bias).

In the chart above, the y-axis represents the probability of a person agreeing that human activity caused climate change, and the x-axis represents the percentile a person scored on the scientific knowledge test. The width of the bars shows the confidence interval for that probability.

As a CDO, a most disconcerting finding is that individuals with more “scientific intelligence” are the quickest to align themselves on subjects they don’t know anything about. In one experiment, Kahan analyzed how people’s opinions on an unfamiliar subject are affected when they are given some basic scientific information, along with details about what people in their self-identified political group tend to believe about that subject. It turned out that those with the strongest scientific reasoning skills were the ones most likely to use the information to develop partisan opinions.

Critically, Kahan’s research shows that people who score well on a measure called “scientific curiosity” actually show less partisanship, and it is this aspect we need to use.

As CDOs, we need to move away from “truth”, “facts”, “data” and “right decisions” if we want a board and senior team who can become aligned. We need to present ideas, concepts and how others are finding solutions, and make our teams more curious. Being curious appears to be the best way to bring us together - however counterintuitive that is.


-----

Whilst our ongoing agile iteration into information beings is never-ending, there are the first 100 days in the new role. But what to focus on? Well, that rose-tinted period of conflicting priorities is what </Hello, CDO!> is all about. Maintaining sanity when all else has been lost to untested data assumptions is a different problem entirely.



Questions to help frame your own paradoxes!

Questions to help frame your own paradoxes Leadership must be able to recognise the paradoxes created as they decide on “what they are optimising for.” The last article described two different starting points for the Peak Paradox model; finding paradox and living with paradox.  It is evident that the compromises we elect to live with become more focused as we agreed or decide on what we ar
Questions to help frame your own paradoxes

Leadership must be able to recognise the paradoxes created as they decide on “what they are optimising for.” The last article described two different starting points for the Peak Paradox model: finding paradox and living with paradox.

It is evident that the compromises we elect to live with become more focused as we agree or decide on what we are optimising for. Such a focus has a benefit insomuch as the narrower and more articulate our view of “optimised” becomes, the more our decisions can become aligned. However, the counter is that whilst a sharp focus and alignment will require less compromise for some, it will equally increase the paradoxes, compromises and tensions others will have to live with. One team can be aligned to a vision but not necessarily on how to get there or how to live with the chosen “optimised” approach. Stress is created by these differences and can create cracks and weaknesses in the team and culture. One vision, one culture, one team is very naive in any team larger than one.

Sufficient dwell or thinking time is often not afforded to executive and leadership teams. Indeed, board agendas are so packed and time so restricted that there is little opportunity to debate or dissent. The consequence of this reality is that insufficient time is allowed to consider the impacts of your optimising choices on others. Therefore, over the next few posts, I will offer questions at different levels as per the decision model (below). This is a link to a longer article on choices, decisions, and judgment as it frames “are we asking the right questions?”

The outer yellow ring is the focus of this article. Below are five questions per “Peak” that should help facilitate a conversation to maximise benefits within this space or improve understanding of the compromises others will have to make to come to your peak.



As a reminder of what the peaks represent.  The original detailed definitions of each peak are here. 

Peak Individual Purpose.  To the exclusion of anything else, you are only interested in yourself: selfishness at the extreme. You believe that you are sovereign (not having to ask anyone for permission or forgiveness), your voice and everything you say matter, and everyone should agree. You are all-powerful and can do whatever you want, and you have the freedom and agency to do it.

Peak Work Purpose.  To the exclusion of anything else, the only reason to work is to deliver as much value as possible to the shareholders. Employees, customers and the environment do not matter except in terms of exploitation. The purpose is to be the biggest and most efficient beast on the planet, able to deliver enormous returns; shareholder primacy at its most pure. Simple, clear, no conflict and no compromise, even to the point that rewarding the staff beyond what is fair would be a compromise. Compliance is met with the minimal standard, ensuring that nothing is wasted. (Note: this is not the reason an individual works, but the reason a company works.)

Peak Society Purpose.  To the exclusion of anything else, we have to ensure there is no suffering and no poverty for any living thing. Humans must have equal education, health and safety. There must be total transparency and equality. Everything is equally shared, and no-one has more power, agency or influence than anyone else.

Peak Human Purpose.  To the exclusion of anything else, we are here to escape death, which we do by reproducing as much as we can with the broadest community we can. We also have to adapt as fast as possible. We have to meet our chemistry requirements to stay alive for as long as possible, to adapt and reproduce at the expense of anything else. Whilst all the peak purposes might be controversial (even to myself), saying that the purity of human purpose lies in chemistry/biology might not go down very well. However, this is a model for framing thinking, so please go with it, as it needs to be pure, and every other human purpose conflicts with someone.

How would you answer these questions?

It should be evident that these questions try to frame you towards one natural peak, but even at this level, you will have noticed that whilst you align more to one peak than the others, you are full of paradoxes. Ask different questions and you will align to a different north star.





Jon Udell

Working with Postgres types

In episode 2 of this series I noted that the languages in which I’m writing Postgres functions share a common type system. It took me a while to understand how types work in the context of Postgres functions that can return sets of records and can interact with tables and materialized views. Here is a … Continue reading Working with Postgres types

In episode 2 of this series I noted that the languages in which I’m writing Postgres functions share a common type system. It took me a while to understand how types work in the context of Postgres functions that can return sets of records and can interact with tables and materialized views.

Here is a set-returning function.

create function notes_for_user_in_group(
  _userid text,
  _groupid text)
returns setof annotation as $$
begin
  return query
    select *
    from annotation
    where userid = concat('acct:', _userid)
    and groupid = _groupid;
end;
$$ language plpgsql;

In this case the type that governs the returned set has already been defined: it’s the schema for the annotation table.

        Column         |            Type
-----------------------+-----------------------------
 id                    | uuid
 created               | timestamp without time zone
 updated               | timestamp without time zone
 userid                | text
 groupid               | text
 text                  | text
 tags                  | text[]
 shared                | boolean
 target_uri            | text
 target_uri_normalized | text
 target_selectors      | jsonb
 references            | uuid[]
 extra                 | jsonb
 text_rendered         | text
 document_id           | integer
 deleted               | boolean

The function returns records matching a userid and groupid. I can now find the URLs of documents most recently annotated by me.

select target_uri from notes_for_user_in_group('judell@hypothes.is', '__world__') order by created desc limit 3;

The Postgres response:

                                target_uri
---------------------------------------------------------------------------
 https://news.ycombinator.com/item?id=20020501
 https://www.infoworld.com/article/2886828/github-for-the-rest-of-us.html
 https://web.hypothes.is/help/formatting-annotations-with-markdown/
(3 rows)

You might wonder why the function’s parameters are prefixed with underscores. That’s because variables used in functions can conflict with names of columns in tables. Since none of our column names begin with underscore, it’s a handy differentiator. Suppose the function’s signature were instead:

create function notes_for_user_in_group( userid text, groupid text)

Postgres would complain about a conflict:

ERROR:  column reference "userid" is ambiguous
LINE 2:   where userid = concat('acct:', userid)
                ^
DETAIL:  It could refer to either a PL/pgSQL variable or a table column.

The table has userid and groupid columns that conflict with their eponymous variables. So for functions that combine variables and database values I prefix variable names with underscore.
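As an aside (my addition, not part of the original post): PL/pgSQL also lets you disambiguate by qualifying names rather than prefixing them, so a variant of the function could keep the bare parameter names. This is only a sketch, and you would use it instead of, not alongside, the underscore version.

create function notes_for_user_in_group(
  userid text,
  groupid text)
returns setof annotation as $$
begin
  return query
    select *
    from annotation
    -- qualify the column with the table name and the parameter
    -- with the function name to resolve the ambiguity
    where annotation.userid = concat('acct:', notes_for_user_in_group.userid)
    and annotation.groupid = notes_for_user_in_group.groupid;
end;
$$ language plpgsql;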

Set-returning functions can be called in any SQL SELECT context. In the example above that context is psql, Postgres’ powerful and multi-talented REPL (read-eval-print loop). For an example of a different context, let’s cache the function’s result set in a materialized view.

create materialized view public_notes_for_judell as (
  select *
  from notes_for_user_in_group('judell@hypothes.is', '__world__')
  order by created desc
) with data;

Postgres reports success by showing the new view’s record count.

SELECT 3972

The view’s type is implicitly annotation; its schema matches the one shown above; selecting target_uri from the view is equivalent to selecting target_uri from the setof annotation returned from the function notes_for_user_in_group.

select target_uri from public_notes_for_judell limit 3;

The Postgres response is the same as above.

target_uri
---------------------------------------------
https://news.ycombinator.com/item?id=20020501
https://www.infoworld.com/article/2886828/github-for-the-rest-of-us.html
https://web.hypothes.is/help/formatting-annotations-with-markdown/
http://example.com
(3 rows)

It shows up a lot faster though! Every time you select the function’s result set, the wrapped query has to run. For this particular example that can take a few seconds. It costs the same amount of time to create the view. But once that’s done you can select its contents in milliseconds.
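One trade-off, not shown in the post: a materialized view is a snapshot, so when the underlying annotation data changes you have to refresh it to pick up the new rows.

refresh materialized view public_notes_for_judell;

After a refresh the view again reflects the current result of the wrapped query.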

Now let’s define a function that refines notes_for_user_in_group by reporting the count of notes for each annotated document.

create function annotated_docs_for_user_in_group(
    _userid text,
    _groupid text)
  returns table (
    count bigint,
    userid text,
    groupid text,
    url text
  ) as $$
  begin
    return query
      select
        count(n.*) as anno_count,
        n.userid,
        n.groupid,
        n.target_uri
      from notes_for_user_in_group(_userid, _groupid) n
      group by n.userid, n.groupid, n.target_uri
      order by anno_count desc;
  end;
$$ language plpgsql;

Instead of returning a setof some named type, this function returns an anonymous table. I’ve aliased the set-returning function call notes_for_user_in_group as n and used the alias to qualify the names of selected columns. That avoids another naming conflict. If you write userid instead of n.userid in the body of the function and then call it, Postgres again complains about a conflict.

ERROR: column reference "userid" is ambiguous
LINE 3: userid,
        ^
DETAIL: It could refer to either a PL/pgSQL variable or a table column.

Here’s a sample call to our new function.

select *
from annotated_docs_for_user_in_group(
  'judell@hypothes.is',
  '__world__'
);

The result:

 count |       userid       |  groupid  | target_uri
-------+--------------------+-----------+---------------------------------------------
   516 | judell@hypothes.is | __world__ | http://shakespeare.mit.edu/macbeth/full.html
    73 | judell@hypothes.is | __world__ | https://www.independent.co.uk/news/world/asia/india-floods-bangladesh-nepal-deaths-millions-homeless-latest-news-updates-a7919006.html
    51 | judell@hypothes.is | __world__ | https://www.usatoday.com/story/news/nation-now/2017/06/16/coconut-oil-isnt-healthy-its-never-been-healthy/402719001/

Now let’s create a view based on that function.

create materialized view url_counts_for_public_notes_by_judell as (
  select *
  from annotated_docs_for_user_in_group(
    'judell@hypothes.is',
    '__world__'
  )
) with data;

Postgres says:

SELECT 1710

When you ask for the definition of that view using the \d command in psql:

\d url_counts_for_public_notes_by_judell

It responds with the same table definition used when creating the function.

 Column  |  Type
---------+--------
 count   | bigint
 userid  | text
 groupid | text
 url     | text

Behind the scenes Postgres has created this definition from the anonymous table returned by the function.

To revise the function so that it uses a named type, first create the type.

create type annotated_docs_for_user_in_group as (
  count bigint,
  userid text,
  groupid text,
  url text
);

Postgres reports success:

CREATE TYPE

Now we can use that named type in the function. Since we’re redefining the function, first drop it.

drop function annotated_docs_for_user_in_group;

Uh oh. Postgres is unhappy about that.

ERROR: cannot drop function annotated_docs_for_user_in_group(text,text) because other objects depend on it
DETAIL: materialized view url_counts_for_public_notes_by_judell depends on function annotated_docs_for_user_in_group(text,text)
HINT: Use DROP ... CASCADE to drop the dependent objects too.

A view that depends on a function must be recreated when the function’s signature changes. I’ll say more about this in a future episode on set-returning functions that dynamically cache their results in materialized views. For now, since the view we just created is a contrived throwaway, just drop it along with the function by using CASCADE as Postgres recommends.

drop function annotated_docs_for_user_in_group cascade;

Postgres says:

NOTICE: drop cascades to materialized view url_counts_for_public_notes_by_judell
DROP FUNCTION

Now we can recreate a version of the function that returns setof annotated_docs_for_user_in_group instead of an anonymous table(...)

create function annotated_docs_for_user_in_group(
    _userid text,
    _groupid text)
  returns setof annotated_docs_for_user_in_group as $$
  begin
    return query
      select
        count(n.*) as anno_count,
        n.userid,
        n.groupid,
        n.target_uri
      from notes_for_user_in_group(_userid, _groupid) n
      group by n.userid, n.groupid, n.target_uri
      order by anno_count desc;
  end;
$$ language plpgsql;

The results are the same as above. So why do it this way? In many cases I don’t. It’s extra overhead to declare a type. And just as a view can depend on a function, a function can depend on a type. To see why you might not want such dependencies, suppose we want to also track the most recent note for each URL.

create type annotated_docs_for_user_in_group as (
  count bigint,
  userid text,
  groupid text,
  url text,
  most_recent_note timestamp
);

That won’t work.

ERROR: type "annotated_docs_for_user_in_group" already exists

Dropping the type won’t work either.

ERROR: cannot drop type annotated_docs_for_user_in_group because other objects depend on it
DETAIL: function annotated_docs_for_user_in_group(text,text,text) depends on type annotated_docs_for_user_in_group
HINT: Use DROP ... CASCADE to drop the dependent objects too.

To redefine the type you have to do a cascading drop and then recreate functions that depend on the type. If any of those views depend on dropped functions, the drop cascades to them as well and they also must be recreated. That’s why I often write functions that return table(...) rather than setof TYPE. In dynamic languages it’s convenient to work with untyped bags of values; I find the same to be true when writing functions in Postgres.
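Putting the pieces above together, the redefinition workflow looks roughly like this; the dependent function, and any views the cascade removes, then have to be recreated by hand.

drop type annotated_docs_for_user_in_group cascade;

create type annotated_docs_for_user_in_group as (
  count bigint,
  userid text,
  groupid text,
  url text,
  most_recent_note timestamp
);

-- now recreate annotated_docs_for_user_in_group(_userid, _groupid)
-- and any materialized views that were dropped by the cascade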

Sometimes, though, it’s useful to declare and use types. In my experience so far it makes most sense to do that in Postgres when you find yourself writing the same returns table(...) statement in several related functions. Let’s say we want a function that combines the results of annotated_docs_for_user_in_group for some set of users.

create function annotated_docs_for_users_in_group(
    _userids text[],
    _groupid text)
  returns setof annotated_docs_for_user_in_group as $$
  begin
    return query
      with userids as (
        select unnest(_userids) as userid
      )
      select a.*
      from userids u
      join annotated_docs_for_user_in_group(u.userid, _groupid) a
      on a.userid = concat('acct:', u.userid);
  end;
$$ language plpgsql;

This new function uses the SQL WITH clause to create a common table expression (CTE) that converts an inbound array of userids into a transient table-like object, named userids, with one userid per row. The new function’s wrapped SQL then joins that CTE to the set returned from annotated_docs_for_user_in_group and returns the joined result.

(You can alternatively do this in a more procedural way by creating a loop variable and marching through the array to accumulate results. Early on I used that approach but in the context of Postgres functions I’ve come to prefer the more purely SQL-like set-oriented style.)
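For reference, a minimal sketch of that procedural alternative might look like the following; the _v2 name is just for illustration.

create function annotated_docs_for_users_in_group_v2(
    _userids text[],
    _groupid text)
  returns setof annotated_docs_for_user_in_group as $$
  declare
    _userid text;
  begin
    -- march through the array, accumulating results with return query
    foreach _userid in array _userids loop
      return query
        select * from annotated_docs_for_user_in_group(_userid, _groupid);
    end loop;
  end;
$$ language plpgsql;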

Sharing a common type between the two functions makes them simpler to write and easier to read. More importantly it connects them to one another and to all views derived from them. If I do decide to add most_recent_note to the type, Postgres will require me to adjust all depending functions and views so things remain consistent. That can be a crucial guarantee, and as we’ll see in a future episode it’s a key enabler of an advanced caching mechanism.

1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Monday, 26. July 2021

blog.deanland.com

deanland, the blog, reaches Maturity

It was over 20 years ago that deanland, the blog, first began. It ran on different blogware, actually a few different ones before settling in this Drupal version where it's been for the past (roughly) ten years. Prior to that it had been on Manila (by Userland) which I miss more and more every day. I became a wizard with Manila. It's how I taught myself HTML without knowing that's what I was

It was over 20 years ago that deanland, the blog, first began. It ran on different blogware, actually a few different ones before settling in this Drupal version where it's been for the past (roughly) ten years.

Prior to that it had been on Manila (by Userland) which I miss more and more every day. I became a wizard with Manila. It's how I taught myself HTML without knowing that's what I was doing.  



Phil Windley's Technometria

Ten Reasons to Use Picos for Your Next Decentralized Programming Project

Summary: Picos are a programming model for building decentralized applications that provide significant benefits in the form of abstractions that reduce programmer effort. Here are ten (make that eleven) reasons you should use picos for your next decentralized application. Temperature Sensor Network Built from Picos I didn't start out to write a programming language that naturally supports

Summary: Picos are a programming model for building decentralized applications that provide significant benefits in the form of abstractions that reduce programmer effort. Here are ten (make that eleven) reasons you should use picos for your next decentralized application.

Temperature Sensor Network Built from Picos

I didn't start out to write a programming language that naturally supports decentralized programming using the actor-model, is cloud-native, serverless, and databaseless. Indeed, if I had, I likely wouldn't have succeeded. Instead picos evolved from a simple rule language for modifying web pages to a powerful, general-purpose programming system for building any decentralized application. This post explains what picos are and why they are a great way to build decentralized systems.

Picos are persistent compute objects. Persistence is a core feature that distinguishes picos from other programming models. Picos exhibit persistence in three ways:

Persistent identity—Picos exist, with a single identity, continuously from the moment of their creation until they are destroyed.
Persistent state—Picos have persistent state that programs running in the pico can see and alter. The state is isolated and only available inside the pico.
Persistent availability—Once a pico is created, it is always on and ready to process queries and events.

Persistent identity, state, and availability make picos ideal for modeling entities of all sorts. Applications are formed from cooperating networks of picos, creating systems that better match programmers' mental models. Picos employ the actor model abstraction for distributed computation. Specifically, in response to a received message, a pico may

send messages to other picos—Picos respond to events and queries by running rules. Depending on the rules installed, a pico may raise events for itself or other picos.
create other picos—Picos can create child picos and delete them.
change its internal state (which can affect their behavior when the next message is received)—Each pico has a set of persistent variables that can only be affected by rules that run in response to events.

In addition to the parent-child hierarchy, picos can be arranged in a heterachical network for peer-to-peer communication and computation. A cooperating network of picos reacts to messages, changes state, and sends messages. Picos have an internal event bus for distributing those messages to rules installed in the pico. Rules in the pico are selected to run based on declarative event expressions. The pico matches events on the bus with event scenarios declared in each rule's event expressions. Any rule whose event expression matches is scheduled for execution. Executing rules may raise additional events. More detail about the event loop and pico execution model are available elsewhere.

Here are ten reasons picos are a great development environment for building decentralized applications:

1. Picos can be a computational node that represents or models anything: person, place, organization, smart thing, dumb thing, concept, even a pothole. Because picos encapsulate identity and state, they can be used to easily model entities of all types.

2. Picos use a substitutable hosting model. Picos are hosted on the open-source pico engine. Picos can be moved from pico engine to pico engine without loss of functionality or a change in their identifying features1. More importantly, an application built using picos can employ picos running on multiple engines without programmer effort.

3. Pico-based applications are scalable. Picos provide a decentralized programming model which ensures that an application can use whatever number of picos it needs without locks or multi-threading. Pico-based applications are fully sharded, meaning there is a computational node with isolated state for every entity of interest. Because these nodes can run on different engines without loss of functionality or programmer effort, pico-based applications scale fluidly. Picos provide an architectural model for trillion-node networks where literally everything is online. This provides a better, more scalable model for IoT than the current CompuServe of Things.

4. Picos can provide high availability in spite of potentially unreliable hosting. Multiple picos, stored in many places, can be used to represent a specific entity. Picos modeling a popular web page, for example, could be replicated countless times to ensure the web page is available when needed with low latency. Copies would be eventually consistent with one another and no backups would be needed.

5. Picos are naturally concurrent without the need for locks. Each pico is an island of determinism that can be used as a building block for non-determinant decentralized systems. Each pico is isolated from any other pico, asynchronous processing is the default, and facts about the pico are published through protocols. State changes are determined by rules and respond to incoming messages seen as events. This minimizes contention and supports the efficient use of resources.

6. Picos provide a cloud-native (internet-first) architecture that matches the application architecture to the cloud model. The pico programming model lets developers focus on business logic, not infrastructure. Applications built with picos don't merely reside in the cloud. They are architected to be performant while decentralized. Picos support reactive programming patterns that ensure applications are decoupled in both space and time to the extent possible.

7. Picos enable stateful, databaseless programming. Picos model domain objects in code, not in database tables. Developers don't waste time setting up, configuring, or maintaining a database. Each ruleset forms a closure over the set of persistent variables used in it. Programmers simply use a variable and it is automatically persisted and available whenever a rule or function in the ruleset is executed.

8. Picos use an extensible service model where new functionality can be layered on. Functionality within a pico is provided by rules that respond to events. An event-based programming model ensures services inside a pico are loosely coupled. And isolation of state changes between services (implemented as rulesets) inside the pico ensures a new service can be added without interfering with existing services. An event can select new rules while also selecting existing rules without the programmer changing the control flow2.

9. Picos naturally support a Reactive programming model. Reactive programming with picos directly addresses the challenges of building decentralized applications through abstractions, programming models, data handling, protocols, interaction schemes, and error handling3. In a pico application, distribution is first class and decentralization is natural. You can read more about this in Building Decentralized Applications with Pico Networks and Reactive Programming Patterns: Examples from Fuse.

10. Picos provide better control over terms, apps, and data. This is a natural result of the pico model where each thing has a closure over services and data. Picos cleanly separate the data for different entities. Picos, representing a specific entity, and microservices, representing a specific business capability within the pico, provide fine grained control over data and its processing. For example, if you sell your car, you can transfer the vehicle pico to the new owner, after deleting the trip service, and its associated data, while leaving untouched the maintenance records, which are stored as part of the maintenance service in the pico.

And a bonus reason for using picos:

Picos provide a better model for building the Internet of Things. Picos are an antidote to the CompuServe of Things because they provide a scalable, decentralized model for connecting everything. We built a connected car platform called Fuse to prove this model works (read more about Fuse). Picos are a natural building block for the self-sovereign internet of things (SSIoT) and can easily model the necessary IoT relationships. Picos create an IoT architecture that allows interoperable interactions between devices from different manufacturers.

Use Picos

If you're intrigued and want to get started with picos, there's a Quickstart along with a series of lessons. If you need help, contact me and we'll get you added to the Picolabs Slack. We'd love to help you use picos for your next distributed application.

If you're intrigued by the pico engine, the pico engine is an open source project licensed under a liberal MIT license. You can see current issues for the pico engine here. Details about contributing to the engine are in the repository's README.

Notes

1. The caveat on this statement is that pico engines currently use URLs to identify channels used for inter-pico communication. Moving to a different engine could change the URLs that identify channels because the domain name of the engine changes. These changes can be automated. Future developments on the roadmap will reduce the use of domain names in the pico engine to make moving picos from engine to engine even easier.
2. Note that while rules within a ruleset are guaranteed to execute in the order they appear, rules selected from different rulesets for the same event offer no ordering guarantee. When ordering is necessary, this can be done using rule chaining, guard rules, and idempotence.
3. I first heard this expressed by Jonas Bonér in his Reactive Summit 2020 Keynote.

Tags: picos iot programming decentralization fuse ssiot


Hyperonomy Digital Identity Lab

Bootstrapping a VDR-based Fully Decentralized Object (FDO)/Credential Platform: VON Example

Michael Herman (Trusted Digital Web) 8:35 PMWhat are the common/known strategies for bootstrapping a VDR-based decentralized credential/object platform? …asked naively on purpose. Strategies for placing the first/initial DIDs in the VDR?  …presumably purposed to be the initial Issuer(s) of verifiable … Continue reading →

Michael Herman (Trusted Digital Web) 8:35 PM
What are the common/known strategies for bootstrapping a VDR-based decentralized credential/object platform? …asked naively on purpose.

Strategies for placing the first/initial DIDs in the VDR?  …presumably purposed to be the initial Issuer(s) of verifiable identifiers on the platform?

Best regards,
Michael Herman
Far Left Self-Sovereignist

Stephen Curran 5:37 PM
In Hyperledger Indy, which is a permissioned public network, the first transactions are a DID for one of the “SuperUsers” (aka “Trustees”) of the network, and DIDs for the initial node operators that verify the transactions. From there, DIDs for additional nodes are added, DIDs for other Trustees and then DIDs of other types of users (Endorsers, authors), who in turn create other DIDs and object types.
If you look at von-network (https://github.com/bcgov/von-network) you can spin up a little network (4 nodes in docker) and see the transactions that are used to start the network. In that, the seed for the Trustee DID is well known, so once you’ve started the von-network, you can control it. In a “real” network, that seed (and associated private key) would of course be protected by that first Trustee.
For Sovrin, a ceremony was video’d of all the initial Trustees and Stewards (node operators) when MainNet was started in 2017.

VON Blockchain Explorer

Reference: http://greenlight.bcovrin.vonx.io/browse/domain

Reference: http://greenlight.bcovrin.vonx.io/browse/pool

Initial DID Transactions
Initial Node Transactions
First SCHEMA Transaction

Damien Bod

Securing ASP.NET Core Razor Pages, Web APIs with Azure B2C external and Azure AD internal identities

This article shows how to implement an ASP.NET Core Razor page to authenticate against Azure B2C and use Web APIs from a second ASP.NET Core application which are also protected using Azure B2C App registrations. Azure B2C uses the signin, signup user flow and allows identities to authenticate using an Azure AD single tenant. Two […]

This article shows how to implement an ASP.NET Core Razor page to authenticate against Azure B2C and use Web APIs from a second ASP.NET Core application which are also protected using Azure B2C App registrations. Azure B2C uses the signin, signup user flow and allows identities to authenticate using an Azure AD single tenant. Two APIs are implemented, one for users and one for administrators. Only identities from the Azure AD tenant can use the administrator API. The authorization implementation which forces this, is supported using an ASP.NET Core policy with a handler.

Code: https://github.com/damienbod/azureb2c-fed-azuread

Setup

The ASP.NET Core applications only use Azure B2C to authenticate and authorize. An ASP.NET Core Razor page application is used for the UI, but this can be any SPA, Blazor app or whatever the preferred tech stack is. The APIs are implemented using ASP.NET Core and use Azure B2C to validate and authorize the access tokens. The application accepts two different access tokens from the same Azure B2C identity provider. Each API has a separate scope from the associated Azure App registration. The Admin API uses claims specific to the Azure AD identity to authorize only Azure AD internal users. Other identities cannot use this API, and this needs to be validated. The Azure B2C identity provider federates to the Azure AD single tenant App registration.

Setup Azure B2C App registrations

Three Azure App registrations were created in Azure B2C to support the setup above. Two for the APIs and one for the UI. The API Azure App registrations are standard with just a scope definition. The scope access_as_user was exposed in both and the APIs can be used for user access.

The UI Azure App registration is setup to use an Azure B2C user flow and will have access to both APIs. You need to select the options with the user flows.

Add the APIs to the permissions of the Azure app registration for the UI application.

Setup Azure AD App registration

A single tenant Azure App registration needs to be created in the Azure AD for the internal or admin users. The redirect URL for this is https://”your-tenant-specific-path”/oauth2/authresp. This will be used from an Azure B2C social login using the Open ID Connect provider. You also need to define a user secret and use this later. At present only secrets can be defined in this UI. This is problematic because the secrets have a max expiry of 2 years, if defining this in the portal.

Setup Azure B2C identity provider

A custom identity provider needs to be created to access the single tenant Azure AD for the admin identity authentication. Select the Identity providers in Azure B2C and create a new Open ID Connect custom IDP. Add the data to match the Azure App registration created in the previous step.

Setup Azure B2C user flow

Now a signin, signup user flow can be created to implement the Azure B2C authentication process and the registration process. This will allow local Azure B2C guest users and also the internal administrator users from the Azure AD tenant. The idp claim is required and idp_access_token claim if you require user data from the Azure AD identity. Add the required claims when creating the user flow. The claims can be added when creating the user flow in the User attributes and token claims section. Select the custom Open ID Connect provider and add this to the flow as well.

The user flow is now setup. The Azure App registrations can now be used to login and use either API as required. The idp and the idp_access_token are added for the Azure AD sign-in and this can be validated when using the admin API.

Implementing ASP.NET Core Razor page with Microsoft.Identity.Web

The ASP.NET Core application is secured using the Microsoft.Identity.Web and the Microsoft.Identity.Web.UI Nuget packages. These packages implement the Open ID Connect clients and handle the Azure B2C specific client logic. The AddMicrosoftIdentityWebAppAuthentication method is used to add this and the AzureAdB2C configuration is defined to read the configuration from the app.settings, user secrets, key vault or whatever deployment is used. The rest is standard ASP.NET Core setup.

public void ConfigureServices(IServiceCollection services) { services.AddTransient<AdminApiOneService>(); services.AddTransient<UserApiOneService>(); services.AddHttpClient(); services.AddOptions(); string[] initialScopes = Configuration.GetValue<string>("UserApiOne:ScopeForAccessToken")?.Split(' '); services.AddMicrosoftIdentityWebAppAuthentication(Configuration, "AzureAdB2C") .EnableTokenAcquisitionToCallDownstreamApi(initialScopes) .AddInMemoryTokenCaches(); services.AddRazorPages().AddMvcOptions(options => { var policy = new AuthorizationPolicyBuilder() .RequireAuthenticatedUser() .Build(); options.Filters.Add(new AuthorizeFilter(policy)); }).AddMicrosoftIdentityUI(); } public void Configure(IApplicationBuilder app, IWebHostEnvironment env) { if (env.IsDevelopment()) { app.UseDeveloperExceptionPage(); } else { app.UseExceptionHandler("/Error"); app.UseHsts(); } app.UseHttpsRedirection(); app.UseStaticFiles(); app.UseRouting(); app.UseAuthentication(); app.UseAuthorization(); app.UseEndpoints(endpoints => { endpoints.MapRazorPages(); endpoints.MapControllers(); }); }

The Microsoft.Identity.Web package uses the AzureAdB2C settings for the configuration. This example is using Azure B2C, and the configuration for Azure B2C is different to an Azure AD configuration. The Instance MUST be set to the domain of the Azure B2C tenant and the SignUpSignInPolicyId must be set to use the user flow as required. A signin, signup user flow is used here. The rest is common to both Azure AD and Azure B2C settings. The ScopeForAccessToken matches the two Azure App registrations created for the APIs.

"AzureAdB2C": { "Instance": "https://b2cdamienbod.b2clogin.com", "ClientId": "8cbb1bd3-c190-42d7-b44e-42b20499a8a1", "Domain": "b2cdamienbod.onmicrosoft.com", "SignUpSignInPolicyId": "B2C_1_signup_signin", "TenantId": "f611d805-cf72-446f-9a7f-68f2746e4724", "CallbackPath": "/signin-oidc", "SignedOutCallbackPath ": "/signout-callback-oidc" }, "UserApiOne": { "ScopeForAccessToken": "https://b2cdamienbod.onmicrosoft.com/723191f4-427e-4f77-93a8-0a62dac4e080/access_as_user", "ApiBaseAddress": "https://localhost:44395" }, "AdminApiOne": { "ScopeForAccessToken": "https://b2cdamienbod.onmicrosoft.com/5f4e8bb1-3f4e-4fc6-b03c-12169e192cd7/access_as_user", "ApiBaseAddress": "https://localhost:44395" },

The Admin Razor page uses the AuthorizeForScopes attribute to authorize for the API it uses. This Razor page uses the API service to access the admin API. No authorization is implemented in the UI to validate the identity. Normally the page would be hidden if the identity is not an administrator; I left this out so that it is easier to validate this in the API, as this is only a demo.

namespace AzureB2CUI.Pages
{
    [AuthorizeForScopes(Scopes = new string[] { "https://b2cdamienbod.onmicrosoft.com/5f4e8bb1-3f4e-4fc6-b03c-12169e192cd7/access_as_user" })]
    public class CallAdminApiModel : PageModel
    {
        private readonly AdminApiOneService _apiService;

        public JArray DataFromApi { get; set; }

        public CallAdminApiModel(AdminApiOneService apiService)
        {
            _apiService = apiService;
        }

        public async Task OnGetAsync()
        {
            DataFromApi = await _apiService.GetApiDataAsync().ConfigureAwait(false);
        }
    }
}

The API service uses the ITokenAcquisition to get an access token for the defined scope. If the identity and the Azure App registration are authorized to access the API, then an access token is returned for the identity. This is sent using a HttpClient created using the IHttpClientFactory interface.

using Microsoft.Extensions.Configuration;
using Microsoft.Identity.Web;
using Newtonsoft.Json.Linq;
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace AzureB2CUI
{
    public class AdminApiOneService
    {
        private readonly IHttpClientFactory _clientFactory;
        private readonly ITokenAcquisition _tokenAcquisition;
        private readonly IConfiguration _configuration;

        public AdminApiOneService(IHttpClientFactory clientFactory,
            ITokenAcquisition tokenAcquisition,
            IConfiguration configuration)
        {
            _clientFactory = clientFactory;
            _tokenAcquisition = tokenAcquisition;
            _configuration = configuration;
        }

        public async Task<JArray> GetApiDataAsync()
        {
            var client = _clientFactory.CreateClient();

            var scope = _configuration["AdminApiOne:ScopeForAccessToken"];
            var accessToken = await _tokenAcquisition.GetAccessTokenForUserAsync(new[] { scope }).ConfigureAwait(false);

            client.BaseAddress = new Uri(_configuration["AdminApiOne:ApiBaseAddress"]);
            client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
            client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

            var response = await client.GetAsync("adminaccess").ConfigureAwait(false);
            if (response.IsSuccessStatusCode)
            {
                var responseContent = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                var data = JArray.Parse(responseContent);
                return data;
            }

            throw new ApplicationException($"Status code: {response.StatusCode}, Error: {response.ReasonPhrase}");
        }
    }
}

Implementing the APIs with Microsoft.Identity.Web

The ASP.NET Core project implements the two separate APIs using separate authentication schemes and policies. The AddMicrosoftIdentityWebApiAuthentication method configures services for the user API using the default JWT scheme “Bearer” and the second scheme is setup for the “BearerAdmin” JWT bearer auth for the admin API. All API calls require an authenticated user which is setup in the AddControllers using a global policy. The AddAuthorization method is used to add an authorization policy for the admin API. The IsAdminHandler handler is used to fulfil the IsAdminRequirement requirement.

public void ConfigureServices(IServiceCollection services)
{
    services.AddHttpClient();
    services.AddOptions();

    JwtSecurityTokenHandler.DefaultInboundClaimTypeMap.Clear();

    // IdentityModelEventSource.ShowPII = true;
    services.AddMicrosoftIdentityWebApiAuthentication(
        Configuration, "AzureB2CUserApi");

    services.AddMicrosoftIdentityWebApiAuthentication(
        Configuration, "AzureB2CAdminApi", "BearerAdmin");

    services.AddControllers(options =>
    {
        var policy = new AuthorizationPolicyBuilder()
            .RequireAuthenticatedUser()
            // disabled this to test with users that have no email (no license added)
            // .RequireClaim("email")
            .Build();
        options.Filters.Add(new AuthorizeFilter(policy));
    });

    services.AddSingleton<IAuthorizationHandler, IsAdminHandler>();

    services.AddAuthorization(options =>
    {
        options.AddPolicy("IsAdminRequirementPolicy", policyIsAdminRequirement =>
        {
            policyIsAdminRequirement.Requirements.Add(new IsAdminRequirement());
        });
    });
}

The IsAdminHandler class checks for the idp claim and validates that the single tenant we require for admin identities was used to sign in. The access token also needs to be validated to confirm that it was issued by our Azure B2C and that it has the correct scope. Since this is done using the Microsoft.Identity.Web attributes, we don’t need to do it here.

public class IsAdminHandler : AuthorizationHandler<IsAdminRequirement>
{
    protected override Task HandleRequirementAsync(
        AuthorizationHandlerContext context, IsAdminRequirement requirement)
    {
        if (context == null)
            throw new ArgumentNullException(nameof(context));
        if (requirement == null)
            throw new ArgumentNullException(nameof(requirement));

        var claimIdentityprovider = context.User.Claims.FirstOrDefault(t => t.Type == "idp");

        // check that our tenant was used to signin
        if (claimIdentityprovider != null
            && claimIdentityprovider.Value == "https://login.microsoftonline.com/7ff95b15-dc21-4ba6-bc92-824856578fc1/v2.0")
        {
            context.Succeed(requirement);
        }

        return Task.CompletedTask;
    }
}

The AdminAccessController class is used to provide the admin data for admin identities. The BearerAdmin scheme is required and the IsAdminRequirementPolicy policy. The access token admin scope is also validated.

[Authorize(AuthenticationSchemes = "BearerAdmin", Policy = "IsAdminRequirementPolicy")] [AuthorizeForScopes(Scopes = new string[] { "api://5f4e8bb1-3f4e-4fc6-b03c-12169e192cd7/access_as_user" })] [ApiController] [Route("[controller]")] public class AdminAccessController : ControllerBase { [HttpGet] public List<string> Get() { string[] scopeRequiredByApi = new string[] { "access_as_user" }; HttpContext.VerifyUserHasAnyAcceptedScope(scopeRequiredByApi); return new List<string> { "admin data" }; } }

The user API also validates the access token, this time using the default Bearer scheme. No policy is required here, so only the default global authorization filter is used. The user API scope is validated.

[Authorize(AuthenticationSchemes = "Bearer")] [AuthorizeForScopes(Scopes = new string[] { "api://723191f4-427e-4f77-93a8-0a62dac4e080/access_as_user" })] [ApiController] [Route("[controller]")] public class UserAccessController : ControllerBase

When the application is run, the Azure B2C user flow is used to authenticate and internal or external users can sign-in, sign-up. This view can be customized to match your styles.

Admins can use the admin API and the guest users can use the user APIs.

Notes

This works but it can be improved and there are other ways to achieve this setup. If you require only a subset of identities from the Azure AD tenant, an enterprise app can be used to define the users which can use the Azure AD App registration. Or you can do this with an Azure group and assign this to the app and the users to the group.

You should also force MFA in the application for admins by validating the claims in the token and also the client ID which the token was created for. (as well as in the Azure AD tenant.)

Azure B2C is still using version one access tokens and, it seems, the federation to Azure AD does not use PKCE.

The Open ID Connect client requires a secret to access the Azure AD App registration. This can only be defined for a max of two years and it is not possible to use managed identities or a certificate. This means you would need to implement a secret rotation script or something similar so that the solution does not stop working. This is not ideal in Azure and is solved better in other IDPs. It should be possible to define long-living secrets using the Powershell module and update them with every release.

It would also be possible to use the Graph API to validate the identity accessing the admin API, user API.

Azure B2C API connectors could also be used to add extra claims to the tokens for usage in the application.

Links:

https://docs.microsoft.com/en-us/azure/active-directory-b2c/overview

https://docs.microsoft.com/en-us/azure/active-directory-b2c/identity-provider-azure-ad-single-tenant?pivots=b2c-user-flow

https://github.com/AzureAD/microsoft-identity-web

https://docs.microsoft.com/en-us/azure/active-directory/develop/microsoft-identity-web

https://docs.microsoft.com/en-us/azure/active-directory-b2c/identity-provider-local

https://docs.microsoft.com/en-us/azure/active-directory/

https://docs.microsoft.com/en-us/aspnet/core/security/authentication/azure-ad-b2c

https://github.com/azure-ad-b2c/azureadb2ccommunity.io

https://github.com/azure-ad-b2c/samples

Sunday, 25. July 2021

Here's Tom with the Weather

At the cloak room

At the cloak room

Saturday, 24. July 2021

Jon Udell

pl/pgsql versus pl/python? Here’s why I’m using both to write Postgres functions.

In A virtuous cycle for analytics I noted that our library of Postgres functions is written in two languages: Postgres’ built-in pl/pgsql and the installable alternative pl/python. These share a common type system and can be used interchangeably. Here’s a pl/pgsql classifier that tries to match the name of a course against a list of … Continue reading pl/pgsql versus pl/python? Here’s why I’m using

In A virtuous cycle for analytics I noted that our library of Postgres functions is written in two languages: Postgres’ built-in pl/pgsql and the installable alternative pl/python. These share a common type system and can be used interchangeably.

Here’s a pl/pgsql classifier that tries to match the name of a course against a list of patterns that characterize the humanities.

create function humanities_classifier(course_name text) returns boolean as $$
  begin
    return lower(course_name) ~ any(array[
      'psych',
      'religio',
      'soci'
    ]);
  end;
$$ language plpgsql;

# select humanities_classifier('Religious Studies 101') as match;
match
-----
t

# select humanities_classifier('Comparative Religions 200') as match;
match
-----
t

Here is that same classifier in Python.

create function humanities_classifier(course_name text) returns boolean as $$
  sql = f"""
    select lower('{course_name}') ~ any(array[
      'psych',
      'religio',
      'soci'
    ]) as match"""
  results = plpy.execute(sql)
  return results[0]['match']
$$ language plpython3u;

# select humanities_classifier('Religious Studies 101') as match;
match
-----
t

# select humanities_classifier('Comparative Religions 200') as match;
match
-----
t

The results are exactly the same. In this case, Python is only wrapping the SQL used in the original function and interpolating course_name into it. So why use pl/python here? I wouldn’t. The pl/pgsql version is cleaner and simpler because the SQL body doesn’t need to be quoted and course_name doesn’t need to be interpolated into it.

Here’s a more Pythonic version of the classifier.

create function humanities_classifier(course_name text) returns boolean as $$
  import re
  regexes = [
    'psych',
    'religio',
    'soci'
  ]
  matches = [r for r in regexes if re.search(r, course_name, re.I)]
  return len(matches)
$$ language plpython3u;

There’s no SQL here, this is pure Python. Is there any benefit to doing things this way? In this case probably not. The native Postgres idiom for matching a string against a list of regular expressions is cleaner and simpler than the Python technique shown here. A Python programmer will be more familiar with list comprehensions than with the Postgres any and ~ operators but if you’re working in Postgres you’ll want to know about those, and use them not just in functions but in all SQL contexts.

What about performance? You might assume as I did that a pl/pgsql function is bound to be way faster than its pl/python equivalent. Let’s check that assumption. This SQL exercises both flavors of the function, which finds about 500 matches in a set of 30,000 names.

with matching_courses as (
  select humanities_classifier(name) as match
  from lms_course_groups
)
select count(*)
from matching_courses
where match;

Here are the results for three runs using each flavor of the function:

pl/pgsql:  159ms, 201ms, 125ms
pl/python: 290ms, 255ms, 300ms

The Python flavor is slower but not order-of-magnitude slower; I’ve seen cases where a pl/python function outperforms its pl/pgsql counterpart.

So, what is special about Python functions inside Postgres? In my experience so far there are three big reasons to use it.

Python modules

The ability to wield any of Python’s built-in or loadable modules inside Postgres brings great power. That entails great responsibility, as the Python extension is “untrusted” (that’s the ‘u’ in ‘plpython3u’) and can do anything Python can do on the host system: read and write files, make network requests.

Here’s one of my favorite examples so far. Given a set of rows that count daily or weekly annotations for users in a group — so for weekly accounting each row has 52 columns — the desired result for the whole group is the element-wise sum of the rows. That’s not an easy thing in SQL but it’s trivial using numpy, and in pl/python it happens at database speed because there’s no need to transfer SQL results to an external Python program.
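The post doesn’t include that function, but here is a minimal sketch of the idea, assuming a hypothetical table user_weekly_counts with a groupid column and an integer-array column weekly_counts (one row of 52 weekly counts per user).

create function group_weekly_totals(_groupid text)
returns int[] as $$
  # element-wise sum of the per-user weekly counts, using numpy
  import numpy as np
  rows = plpy.execute(
    f"select weekly_counts from user_weekly_counts where groupid = '{_groupid}'")
  if len(rows) == 0:
    return []
  matrix = np.array([rows[i]['weekly_counts'] for i in range(len(rows))])
  return np.sum(matrix, axis=0).tolist()
$$ language plpython3u;

A production version would pass _groupid through plpy.prepare rather than interpolating it, but the shape of the solution is the same.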

Metaprogramming

Functions can write and then run SQL queries. It’s overkill for simple variable interpolation; as shown above pl/pgsql does that handily without the cognitive overhead and visual clutter of poking values into a SQL string. For more advanced uses that compose queries from SQL fragments, though, pl/pgsql is hopeless. You can do that kind of thing far more easily, and more readably, in Python.
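As a toy illustration of that kind of composition (not from this series), a pl/python function can assemble a query from fragments passed in as arguments:

create function count_where(_table text, _conditions text[])
returns bigint as $$
  # compose a query from SQL fragments, then run it
  # illustrative only: the fragments are trusted here, so don't expose this to user input
  where_clause = ' and '.join(_conditions) if _conditions else 'true'
  sql = f"select count(*) as n from {_table} where {where_clause}"
  return plpy.execute(sql)[0]['n']
$$ language plpython3u;

Called as select count_where('annotation', array['shared', 'not deleted']); it builds and runs select count(*) from annotation where shared and not deleted.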

Introspection

A pl/python function can discover and use its own name. That’s the key enabler for a mechanism that memoizes the results of a function by creating a materialized view whose name combines the name of the function with the value of a parameter to the function. This technique has proven to be wildly effective.

I’ll show examples of these scenarios in later installments of this series. For now I just want to explain why I’ve found these two ways of writing Postgres functions to be usefully complementary. The key points are:

– They share a common type system.

– pl/pgsql, despite its crusty old syntax, suffices for many things.

– pl/python leverages Python’s strengths where they are most strategic

When I began this journey it wasn’t clear when you’d prefer one over the other, or why it might make sense to use both in complementary ways. This installment is what I’d like to have known when I started.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Thursday, 22. July 2021

@_Nat Zone

The Organizing Committee's Lack of Governance as Seen in the Choice of Show Director for the Tokyo Olympics

(This article is a repost of a Facebook post published at 11:03 on the 22nd…) The post The Organizing Committee's Lack of Governance as Seen in the Choice of Show Director for the Tokyo Olympics first appeared on @_Nat Zone.

(This article is a repost of a Facebook post published at 11:03 on the 22nd.)
The Simon Wiesenthal Center has now taken action over the Holocaust gag by Kentaro Kobayashi, the show director for the Olympics.

Saori Imai's article on Yahoo News is headlined "Urgent petition: Prime Minister Suga, discipline Olympic show director Kobayashi today, before the Tokyo Olympics becomes an eternal disgrace"1, but even a show he produced could be attacked after the fact if it were staged at the Olympics and Paralympics, so at this point shouldn't the show simply be cancelled and replaced with a moment of silence2 — for the victims of COVID-19, for the victims of the earthquake, and for the victims of the Holocaust3, past and still continuing?

As for "discipline": as I have said before, this whole series of events plainly shows that this is a problem of governance in the Olympic organizing committee4. So I believe the people who supported the current governance structure, including those at the top, need to be replaced wholesale.

Below, I will keep adding to the timeline.
2021-07-22
1:11 Tweet laying out the sequence of events from the consultation with Deputy Defense Minister Nakayama to the inquiry to the SWC
4:12 Kosuke Takahashi, "'Playing at the mass murder of Jews': Olympic opening ceremony director Kentaro Kobayashi used the Holocaust as comedy material in his days as a comedian"
8:44 Saori Imai, "Urgent petition: Prime Minister Suga, discipline Olympic show director Kobayashi today, before the Tokyo Olympics becomes an eternal disgrace"
10:50 @_nat, "The SWC has moved. Like I keep saying, this is a governance problem of the Olympic organizing committee. It has to be changed drastically from the top down."
11:03 This article
12:40 Reuters, "Tokyo Olympics dismisses the opening ceremony director the day before the event; content to be reviewed"
12:48 Kyoko Shimbun (satire), "[Breaking] Opening ceremony to be held after the closing ceremony; replacement could not be prepared in time after continued troubles (Kyoko Shimbun, announced 12:48)"
13:03 Kyodo News, "President Hashimoto: 'We would like to go ahead with the opening ceremony'"

The post The Organizing Committee's Lack of Governance as Seen in the Choice of Show Director for the Tokyo Olympics first appeared on @_Nat Zone.

Wednesday, 21. July 2021

Jon Udell

A virtuous cycle for analytics

Suppose you’re a member of a team that runs a public web service. You need to help both internal and external users make sense of all the data that’s recorded as it runs. That’s been my role for the past few years, now it’s time to summarize what I’ve learned. The web service featured in … Continue reading A virtuous cycle for analytics

Suppose you’re a member of a team that runs a public web service. You need to help both internal and external users make sense of all the data that’s recorded as it runs. That’s been my role for the past few years, now it’s time to summarize what I’ve learned.

The web service featured in this case study is the Hypothesis web annotation system. The primary database, Postgres, stores information about users, groups, documents, courses, and annotations. Questions that our team needs to answer include:

– How many students created annotations last semester?

– In how many courses at each school?

Questions from instructors using Hypothesis in their courses include:

– Which passages in course readings are attracting highlights and discussion?

– Who is asking questions about those passages, and who is responding?

Early on we adopted a tool called Metabase that continues to be a pillar of our analytics system. When Metabase was hooked up to our Postgres database the team could start asking questions without leaning on developers. Some folks used the interactive query builder, while others went straight to writing SQL that Metabase passes through to Postgres.

Before long we had a large catalog of Metabase questions that query Postgres and display results as tables or charts that can be usefully arranged on Metabase dashboards. It’s all nicely RESTful. Interactive elements that can parameterize queries, like search boxes and date pickers, map to URLs. Queries can emit URLs in order to compose themselves with other queries. I came to see this system as a kind of lightweight application server in which to incubate an analytics capability that could later be expressed more richly.

Over time, and with growing amounts of data, early success with this approach gave way to two kinds of frustration: queries began to choke, and the catalog of Metabase questions became unmanageable. And so, in the time-honored tradition, we set up a data warehouse for analytics. Ours is another instance of Postgres that syncs nightly with the primary database. There are lots of other ways to skin the cat but it made sense to leverage ops experience with Postgres and I had a hunch that it would do well in this role.

To unthrottle the choking queries I began building materialized views that cache the results of Postgres queries. Suppose a query makes use of available indexes but still takes a few minutes, or maybe even an hour, to run. It still takes that long to build the corresponding materialized view, but once built other queries can use its results immediately. Metabase questions that formerly included chunks of SQL began reducing to select * from {viewname}.
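The shape of that conversion, using a hypothetical view name and the annotation table described elsewhere in this series, is simply:

create materialized view annotations_per_user as (
  select userid, count(*) as annotation_count
  from annotation
  group by userid
) with data;

-- the Metabase question then reduces to
select * from annotations_per_user;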

This process continues to unfold in a highly productive way. Team members may or may not hit a performance wall as they try to use Metabase to answer their questions. When they do, we can convert the SQL text of a Metabase question to a Postgres materialized view that gets immediate results. Such views can join with others, and/or with underlying tables, in SQL SELECT contexts. The views become nouns in a language that expresses higher-order business concepts.

The verbs in this language turned out to be Postgres functions written in the native procedural language, pl/pgsql, and later also in its Python counterpart, pl/python. Either flavor can augment built-in Postgres library functions with user-defined functions that can return simple values, like numbers and strings, but can also return sets that behave in SQL SELECT contexts just like tables and views.

Functions were, at first, a way to reuse chunks of SQL that otherwise had to be duplicated across Metabase questions and Postgres CREATE MATERIALIZED VIEW statements. That made it possible to streamline and refactor both bodies of code and sanely maintain them.

To visualize what had now become a three-body system of sources in which Metabase questions, Postgres views, and Postgres functions can call (or link to) one another, I wrote a tool that builds a crosslinked concordance. That made it practical to reason effectively about the combined system.

Along the way I have learned how Postgres, and more broadly modern SQL, in conjunction with a tool like Metabase, can enable a team like ours to make sense of data. There’s plenty to say about the techniques I’ve evolved, and I aim to write them up over time. The details won’t interest most people, but here’s an outcome that might be noteworthy.

Team member: I had an idea that will help manage our communication with customers, and I’ve prototyped it in a Metabase question.

Toolsmith: Great! Here’s a Postgres function that encapsulates and refines your SQL. It’s fast enough for now, but if needed we can convert it into a materialized view. Now you can use that function in another Metabase question that projects your SQL across a set of customers that you can select.

That interaction forms the basis of a virtuous cycle: The team member formulates a question and does their best to answer it using Metabase; the toolsmith captures the intent and re-expresses it in a higher-level business language; the expanded language enables the team member to go farther in a next cycle.

We recognize this software pattern in the way application programmers who push a system to its limits induce systems programmers to respond with APIs that expand those limits. I suppose it’s harder to see when the application environment is Metabase and the systems environment is Postgres. But it’s the same pattern, and it is powerful.


1 https://blog.jonudell.net/2021/07/21/a-virtuous-cycle-for-analytics/
2 https://blog.jonudell.net/2021/07/24/pl-pgsql-versus-pl-python-heres-why-im-using-both-to-write-postgres-functions/
3 https://blog.jonudell.net/2021/07/27/working-with-postgres-types/
4 https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
5 https://blog.jonudell.net/2021/08/13/pl-python-metaprogramming/
6 https://blog.jonudell.net/2021/08/15/postgres-and-json-finding-document-hotspots-part-1/
7 https://blog.jonudell.net/2021/08/19/postgres-set-returning-functions-that-self-memoize-as-materialized-views/
8 https://blog.jonudell.net/2021/08/21/postgres-functional-style/
9 https://blog.jonudell.net/2021/08/26/working-in-a-hybrid-metabase-postgres-code-base/
10 https://blog.jonudell.net/2021/08/28/working-with-interdependent-postgres-functions-and-materialized-views/
11 https://blog.jonudell.net/2021/09/05/metabase-as-a-lightweight-app-server/
12 https://blog.jonudell.net/2021/09/07/the-postgres-repl/

Tuesday, 20. July 2021

@_Nat Zone

Another Olympic Opening Ceremony That Could Have Been

Today, for a particular reason, I went to the Hermès gallery in Ginza, where… The post Another Olympic Opening Ceremony That Could Have Been first appeared on @_Nat Zone.

Today, for a particular reason, I went to see "Exhibition Cuttings," an exhibition by Mathieu Copeland being held at the Hermès gallery in Ginza1. I am completely out of touch with the art scene, but it reminded me that 1964, the year of the previous Tokyo Olympics, was a time when Japan was leading the world.

The Great Panorama Exhibition, for example2.

Normally a solo show at a gallery opens with a glamorous reception on the first day, but this one was an anti-exhibition: it began with the gallery being sealed off, and ended with the closure being lifted and celebrated at the end of the show. Looking at it, and at the movements around the world that followed as an antithesis to the commercialization of art, gave me a lot to think about.

One of those thoughts was another Olympic opening ceremony that could have been.

Since this was a period of COVID lockdowns, make it a fittingly anti-opening ceremony: no athletes enter the stadium; instead they line up, socially distanced, on the balconies of their rooms in the Olympic Village, while one half of the TV screen continuously shows the National Stadium holding only President Bach, the other Olympic aristocrats and the officials, and the other half shows drones panning across the dormitory balconies country by country. The music is Cage's 4'33", with a silent prayer offered for those who died of COVID and, since these are supposed to be the "Recovery Olympics," for those who died in the earthquake ten years ago. For the latter, show the actual state of the recovery without any concealment.

During the Olympics, the cameras stay glued to the IOC aristocrats, leaving behind an art-style documentary of the Tokyo Olympics.

The athletes gathering in one place is, of course, held back until the very end of the closing ceremony. Only those who are still there, that is; those who have already left join virtually on a big screen.

And at the very, very end, some symbolic figure (who would be good?) shuts out the Olympic aristocrats, the athletes assemble, and they declare a break with commercialism3. It has been this way ever since the LA Olympics, but because these Games in particular have laid bare the darkness of commercialism and the attention economy4, I allowed myself this little fantasy.

The post Another Olympic Opening Ceremony That Could Have Been first appeared on @_Nat Zone.

Damien Bod

Using an ASP.NET Core IHostedService to run Azure Service Bus subscriptions and consumers

This post shows how an Azure Service Bus subscription for topics, or a consumer for a queue, can be used inside an ASP.NET Core application. The Azure Service Bus client listens to events and needs to be started, stopped and registered to the topic to receive messages. An IHostedService is used for this. Code: https://github.com/damienbod/AspNetCoreServiceBus Posts […]

This post shows how an Azure Service Bus subscription for topics, or a consumer for a queue, can be used inside an ASP.NET Core application. The Azure Service Bus client listens to events and needs to be started, stopped and registered to the topic to receive messages. An IHostedService is used for this.

Code: https://github.com/damienbod/AspNetCoreServiceBus

Posts in this series:

Using Azure Service Bus Queues with ASP.NET Core Services
Using Azure Service Bus Topics in ASP.NET Core
Using Azure Service Bus Topics Subscription Filters in ASP.NET Core
Using Entity Framework Core to process Azure Service Messages in ASP.NET Core
Using an Azure Service Bus Topic Subscription in an Azure Function
Using Azure Service Bus with restricted access
Using an ASP.NET Core IHostedService to run Azure Service Bus subscriptions and consumers

The ServiceBusTopicSubscription class is used to set up the Azure Service Bus subscription. The class uses the ServiceBusClient to set up the message handler, and the ServiceBusAdministrationClient to implement filters and to add or remove these rules. The Azure.Messaging.ServiceBus Nuget package is used to connect to the subscription.

using Azure.Messaging.ServiceBus;
using Azure.Messaging.ServiceBus.Administration;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace ServiceBusMessaging
{
    public class ServiceBusTopicSubscription : IServiceBusTopicSubscription
    {
        private readonly IProcessData _processData;
        private readonly IConfiguration _configuration;
        private const string TOPIC_PATH = "mytopic";
        private const string SUBSCRIPTION_NAME = "mytopicsubscription";
        private readonly ILogger _logger;
        private readonly ServiceBusClient _client;
        private readonly ServiceBusAdministrationClient _adminClient;
        private ServiceBusProcessor _processor;

        public ServiceBusTopicSubscription(IProcessData processData,
            IConfiguration configuration,
            ILogger<ServiceBusTopicSubscription> logger)
        {
            _processData = processData;
            _configuration = configuration;
            _logger = logger;

            var connectionString = _configuration.GetConnectionString("ServiceBusConnectionString");
            _client = new ServiceBusClient(connectionString);
            _adminClient = new ServiceBusAdministrationClient(connectionString);
        }

        public async Task PrepareFiltersAndHandleMessages()
        {
            ServiceBusProcessorOptions _serviceBusProcessorOptions = new ServiceBusProcessorOptions
            {
                MaxConcurrentCalls = 1,
                AutoCompleteMessages = false,
            };

            _processor = _client.CreateProcessor(TOPIC_PATH, SUBSCRIPTION_NAME, _serviceBusProcessorOptions);
            _processor.ProcessMessageAsync += ProcessMessagesAsync;
            _processor.ProcessErrorAsync += ProcessErrorAsync;

            await RemoveDefaultFilters().ConfigureAwait(false);
            await AddFilters().ConfigureAwait(false);
            await _processor.StartProcessingAsync().ConfigureAwait(false);
        }

        private async Task RemoveDefaultFilters()
        {
            try
            {
                var rules = _adminClient.GetRulesAsync(TOPIC_PATH, SUBSCRIPTION_NAME);
                var ruleProperties = new List<RuleProperties>();
                await foreach (var rule in rules)
                {
                    ruleProperties.Add(rule);
                }

                foreach (var rule in ruleProperties)
                {
                    if (rule.Name == "GoalsGreaterThanSeven")
                    {
                        await _adminClient.DeleteRuleAsync(TOPIC_PATH, SUBSCRIPTION_NAME, "GoalsGreaterThanSeven")
                            .ConfigureAwait(false);
                    }
                }
            }
            catch (Exception ex)
            {
                _logger.LogWarning(ex.ToString());
            }
        }

        private async Task AddFilters()
        {
            try
            {
                var rules = _adminClient.GetRulesAsync(TOPIC_PATH, SUBSCRIPTION_NAME)
                    .ConfigureAwait(false);
                var ruleProperties = new List<RuleProperties>();
                await foreach (var rule in rules)
                {
                    ruleProperties.Add(rule);
                }

                if (!ruleProperties.Any(r => r.Name == "GoalsGreaterThanSeven"))
                {
                    CreateRuleOptions createRuleOptions = new CreateRuleOptions
                    {
                        Name = "GoalsGreaterThanSeven",
                        Filter = new SqlRuleFilter("goals > 7")
                    };
                    await _adminClient.CreateRuleAsync(TOPIC_PATH, SUBSCRIPTION_NAME, createRuleOptions)
                        .ConfigureAwait(false);
                }
            }
            catch (Exception ex)
            {
                _logger.LogWarning(ex.ToString());
            }
        }

        private async Task ProcessMessagesAsync(ProcessMessageEventArgs args)
        {
            var myPayload = args.Message.Body.ToObjectFromJson<MyPayload>();
            await _processData.Process(myPayload).ConfigureAwait(false);
            await args.CompleteMessageAsync(args.Message).ConfigureAwait(false);
        }

        private Task ProcessErrorAsync(ProcessErrorEventArgs arg)
        {
            _logger.LogError(arg.Exception, "Message handler encountered an exception");
            _logger.LogDebug($"- ErrorSource: {arg.ErrorSource}");
            _logger.LogDebug($"- Entity Path: {arg.EntityPath}");
            _logger.LogDebug($"- FullyQualifiedNamespace: {arg.FullyQualifiedNamespace}");
            return Task.CompletedTask;
        }

        public async ValueTask DisposeAsync()
        {
            if (_processor != null)
            {
                await _processor.DisposeAsync().ConfigureAwait(false);
            }

            if (_client != null)
            {
                await _client.DisposeAsync().ConfigureAwait(false);
            }
        }

        public async Task CloseSubscriptionAsync()
        {
            await _processor.CloseAsync().ConfigureAwait(false);
        }
    }
}

The WorkerServiceBus class implements the IHostedService interface and uses the IServiceBusTopicSubscription interface to subscribe to an Azure Service Bus topic. The StartAsync method registers the queue consumer using RegisterOnMessageHandlerAndReceiveMessages and the topic subscription using PrepareFiltersAndHandleMessages. The interface provides start, stop and dispose methods, so the Azure Service Bus clients are controlled through this hosted service. If needed, a periodic task could be implemented here, for example to run health checks on the client.

public class WorkerServiceBus : IHostedService, IDisposable
{
    private readonly ILogger<WorkerServiceBus> _logger;
    private readonly IServiceBusConsumer _serviceBusConsumer;
    private readonly IServiceBusTopicSubscription _serviceBusTopicSubscription;

    public WorkerServiceBus(IServiceBusConsumer serviceBusConsumer,
        IServiceBusTopicSubscription serviceBusTopicSubscription,
        ILogger<WorkerServiceBus> logger)
    {
        _serviceBusConsumer = serviceBusConsumer;
        _serviceBusTopicSubscription = serviceBusTopicSubscription;
        _logger = logger;
    }

    public async Task StartAsync(CancellationToken stoppingToken)
    {
        _logger.LogDebug("Starting the service bus queue consumer and the subscription");
        await _serviceBusConsumer.RegisterOnMessageHandlerAndReceiveMessages().ConfigureAwait(false);
        await _serviceBusTopicSubscription.PrepareFiltersAndHandleMessages().ConfigureAwait(false);
    }

    public async Task StopAsync(CancellationToken stoppingToken)
    {
        _logger.LogDebug("Stopping the service bus queue consumer and the subscription");
        await _serviceBusConsumer.CloseQueueAsync().ConfigureAwait(false);
        await _serviceBusTopicSubscription.CloseSubscriptionAsync().ConfigureAwait(false);
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual async void Dispose(bool disposing)
    {
        if (disposing)
        {
            await _serviceBusConsumer.DisposeAsync().ConfigureAwait(false);
            await _serviceBusTopicSubscription.DisposeAsync().ConfigureAwait(false);
        }
    }
}

The hosted service is added to the services collection in the ConfigureServices method using AddHostedService. The Azure Service Bus subscription can now be managed by the host and consume messages from the topic subscription (or from a queue, if one is used instead).

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers();

    var connection = Configuration.GetConnectionString("DefaultConnection");
    services.AddDbContext<PayloadContext>(options => options.UseSqlite(connection));

    services.AddSingleton<IServiceBusConsumer, ServiceBusConsumer>();
    services.AddSingleton<IServiceBusTopicSubscription, ServiceBusTopicSubscription>();
    services.AddSingleton<IProcessData, ProcessData>();
    services.AddHostedService<WorkerServiceBus>();

    services.AddSwaggerGen(c =>
    {
        c.SwaggerDoc("v1", new OpenApiInfo
        {
            Version = "v1",
            Title = "Payload API",
        });
    });
}

When the application is run, the messages can be sent to the topic and are received using the IHostedService Azure Service Bus subscription.
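To exercise the filter quickly, a test message can be sent to the topic with a goals application property greater than seven, which is what the subscription's SqlRuleFilter evaluates. Here is a minimal sketch using the Python azure-servicebus package rather than the ASP.NET Core sample code; the connection string placeholder and the MyPayload field names are assumptions, not part of the original sample.

import json
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONNECTION_STRING = "<ServiceBusConnectionString>"  # assumed placeholder

# Send one test message to "mytopic". The "goals" application property is
# what the subscription's SqlRuleFilter "goals > 7" evaluates, so this
# message should reach "mytopicsubscription".
with ServiceBusClient.from_connection_string(CONNECTION_STRING) as client:
    with client.get_topic_sender(topic_name="mytopic") as sender:
        message = ServiceBusMessage(
            json.dumps({"Name": "test", "Goals": 9}),  # assumed MyPayload shape
            content_type="application/json",
            application_properties={"goals": 9},
        )
        sender.send_messages(message)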

Links:

https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/servicebus/Azure.Messaging.ServiceBus

https://docs.microsoft.com/en-us/azure/service-bus-messaging/

https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-dotnet-get-started-with-queues

https://docs.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services

Friday, 16. July 2021

Kerri Lemole

The Future of Open Badges is Verifiable

‘hexadaisy layout’ by Darla available at http://www.flickr.com/photos/pedrosz/2040577615 under an Attribution 2.0 Generic (CC BY 2.0) license

When Open Badges was kicking off ten years ago (see the original white paper), it was conceived to be a recognition infrastructure for skills attained and achievements accomplished anywhere at any time. Badges could assert skills learned informally, formally, really in any aspect of and through life. It was hoped that recruiters, employers, and others could evaluate the badges to find people who had skills that aligned with the opportunities being offered. It was envisioned they could work like other types of credentials such as degrees and certifications, “but with room for much more granular or diverse skill representation” and available opportunities to capture these skills.

The infrastructure for this type of recognition system was a format called a digital badge. More than just a digital sticker, the badges would be filled with metadata properties describing the skills and achievements. Then the badges could convey the metadata both as human readable and machine readable content. By having a standard to describe the metadata and how it would be verified, Open Badges became usable in many technologies and contexts. For the most part, this resulted in badges being shared online on issuing platforms and social media — more as a tool for human understanding than one that took full advantage of the potential for machine readability.

Since then, the ethos of Open Badges, recognizing skills anywhere and anytime, has strengthened. Growth has accelerated, especially in the past few years. IMS Global and Credential Engine’s Badge Count 2020 Report indicated that there’d been an 80% increase in issued badges since 2018. Its use has expanded into more formal institutional contexts, the metadata have evolved, and initiatives have risen around it including the Open Recognition Alliance, Badge Summit, and the Open Skills Network.

Over the years, internet technologies have evolved too and a credential verification model at the W3C called Verifiable Credentials has gained traction. This verification model can be used for any type of credential including passports, drivers licenses, and educational credentials like Open Badges. In fact, members of the Open Badges community helped to write the very first use cases for Verifiable Credentials back in 2018. This was because we knew then that if Open Badges data were to be trusted by apps, they not only needed to be readable by machines, they needed to be verifiable.

The Verifiable Credential model can provide security and privacy enhancements not yet available to Open Badges. This model gives learners persistent access to their badges increasing the longevity and resilience of credentials that were intended to promote and support lifelong learning. A more secure and universally supported verification model such as Verifiable Credentials enables Open Badges to become the personal skills currency it was originally envisioned to be.

Concentric Sky (Badgr) has offered a proposal for a new version of Open Badges that explains what is needed to make it work within the Verifiable Credentials model. The use cases in the proposal provide scenarios showing where Open Badges are earned and how they are exchanged. In one use case, a medical professional is able to start her new job sooner because her CMEs can be quickly verified:

Verifying Continuing Ed: Denise was offered a new job at a hospital as a physician assistant. Before starting, her continuing education training and license to practice needed to be verified. The last time she switched hospitals, the verification process took three weeks. This time, she was able to provide her badges to prove her training and license. Within minutes her credentials were verified and she was issued a new digital staff credential

In another use case, a career change is facilitated by using verifiable Open Badges to map skills to jobs:

Mapping Skills: Sid is shifting careers after many years working in construction. In his digital wallet he had several skill badges describing his mastery of several skills in construction but also in teamwork, communication, and organizational skills. Sid also had badges from some courses he’d taken in science and math over the last few years. After he uploaded the skill and course badges from his wallet to a career planning site, he was offered several opportunities to apply for work in software sales and cybersecurity.

Here’s an example of how an Open Badge as a Verifiable Credential (Open Badges 3.0) exchange could work:

1. A learner connects to a badge issuing platform with their digital wallet app on their phone or laptop.
2. Once authenticated, the issuer provides the badge to the learner who puts it in their wallet. The badge data contains cryptographic proof that identifies the issuer and the learner.
3. A job employment app asks for proof that the applicant has experience with a requirement needed for that role.
4. The learner presents the job employment app with the badge using the digital wallet app.
5. The job employment app can then verify
a. that the learner providing the badge is the recipient of that badge,
b. that the issuer is the identity that issued the badge, and
c. that the badge data has not changed since it was issued.
6. The verifier responds that the badge is authentic.

In comparison, here’s an Open Badges 2.0 flow:

1. A learner or an organization provides an issuer app with the learner's email address.
2. The issuer generates badge data that includes the email address as the recipient identity and sends the earner the badge (typically as a link to a web page).
3. The earner can share the link on social media, or perhaps with a potential employer or a job application app.
4. The badge is verified by either
a. a human looking at the web page where the badge is hosted, or
b. an application attempting to retrieve the badge data from a URL hosted by the issuer.

The Open Badges 2.0 example depends on the issuer hosting the data and relies on an email address for the learner. The Open Badges 3.0 example is self-contained and doesn’t require the issuer to continue to retain a web hosting provider in order for the credential to remain valid. Instead it uses cryptographic proof to authenticate the badge, the issuer, as well as the learner who earned it. With either example, the learner has a way to proudly share their achievement online but the Open Badges 3.0 method doesn’t rely on that online presence for verification. In fact, the original issuer may no longer exist, but the achievements can still be verified.

On Monday July 19, we’ll be reviewing the Open Badges 3.0 proposal and anyone is invited to join us to learn more. Here’s the meeting info:

Monday, July 19, 2021
Time: 8am PDT / 11am EDT / 4pm BST, 5pm CEST
Jitsi Web Conference: https://meet.w3c-ccg.org/education
US phone: tel:+1.602.932.2243;3

On Thursday, July 22, Concentric Sky will be presenting this proposal to the IMS Global Open Badges working group to seek the members’ go-ahead to move forward on the work to make Open Badges 3.0 a reality. Public comments may be submitted here.

Tuesday, 13. July 2021

Phil Windley's Technometria

Alternatives to the CompuServe of Things


Summary: The current model for connected things puts manufacturers inbetween people and their things. That model negatively affects personal freedom, privacy, and society. Alternate models can provide the same benefits of connected devices without the societal and personal costs.

In Peloton Bricks Its Treadmills, Cory Doctorow discusses Peloton's response to a product recall on its treadmills. Part of the response was a firmware upgrade. Rather than issuing the firmware upgrade to all treadmills, Peloton "bricked" all the treadmills and then only updated the ones where the owner was paying a monthly subscription for Peloton's service.

When I talk about Internet of Things (IoT), I always make the point that the current architecture for IoT ensures that people are merely renting connected things, not owning them, despite paying hundreds, even thousands, of dollars upfront. Terms and conditions on accounts usually allow the manufacturer to close your account for any reason and without recourse. Since many products cannot function without their associated cloud service, this renders the device inoperable.

I wrote about this problem in 2014, describing the current architecture as the CompuServe of Things. I wrote:

If Fitbit decides to revoke my account, I will probably survive. But what if, in some future world, the root certificate authority of the identity documents I use for banking, shopping, travel, and a host of other things decides to revoke my identity for some reason? Or if my car stops running because Ford shuts off my account? People must have autonomy and be in control of the connected things in their life. There will be systems and services provided by others and they will, of necessity, be administered. But those administering authorities need not have control of people and their lives. We know how to solve this problem. Interoperability takes "intervening" out of "administrative authority."

The architecture of the CompuServe of Things looks like this:

CompuServe of Things Architecture

We're all familiar with it. Alice buys a new device, downloads the app for the device to her phone, sets up an account, and begins using the new thing. The app uses the account and the manufacturer-provided API to access data from the device and control it. Everything is inside the administrative control of the device manufacturer (indicated by the gray box).

There is an alternative model:

Internet of Things Architecture

In this model, the device and data about it are controlled by Alice, not the manufacturer. The device and an associated agent (pico) Alice uses to interact with it have a relationship with the manufacturer, but the manufacturer is no longer in control. Alice is in control of her device, the data it generates, and the agent that processes the data. Note that this doesn't mean Alice has to code or even manage all that. She can run her agent in an agency and the code in her agent is likely from the manufacturer. But it could be other code instead of or in addition to what she gets from the manufacturer. The point is that Alice can decide. A true Internet of Things is self-sovereign.

Can this model work? Yes! We proved the model works for a production connected car platform called Fuse in 2013-2014. Fuse had hundreds of customers and over 1000 devices in the field. I wrote many articles about the experience, its architecture, and its advantages on my blog.

Fuse was built with picos. Picos are the actor-model programming system that we've developed over the last 12 years to build IoT products that respect individual autonomy and privacy while still providing all the benefits we've come to expect from our connected devices. I'll write more about picos as a programming model for reactive systems soon. Here's some related reading on my blog:

Life-Like Anonymity and the Poison Web
The Self-Sovereign Internet of Things
Relationships in the Self-Sovereign Internet of Things

Our current model for connected devices is in conflict not only with our ability to function as autonomous individuals, but also with our vision for a well-functioning society. We can do better and we must. Alternate architectures can give us all the benefits of connected devices without the specter of Big Tech intermediating every aspect of our lives.

Photo Credit: modem and phone from Bryan Alexander (CC BY 2.0)

Tags: identity ssi didcomm ssiot picos


Doc Searls Weblog

Speaking of character


It seems fitting that among old medical records I found this portrait of Doctor Dave, my comic persona on radio and in print back in North Carolina, forty-five years ago. The artist is Alex Funk, whose nickname at the time was Czuko (pronounced “Chuck-o”). Alex is an artist, techie and (now literally) old friend of high excellence on all counts.

And, even though I no longer have much hair on my head, and appear to be in my second trimester, my wife and son just said “Oh yeah, that’s you” when I showed this to them. “Totally in character,” said my wife.

I guess so. As Dave says (and does!), I’m still diggin’.

In the spirit of that, I thought this would be worth sharing with the rest of ya’ll.

 

Monday, 12. July 2021

Jon Udell

Working With Intelligent Machines


In The Chess Master and the Computer, Garry Kasparov famously wrote:

The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.

The title of his subsequent TED talk sums it up nicely: Don’t fear intelligent machines. Work with them.

That advice resonates powerfully as I begin a second work week augmented by GitHub Copilot, a coding assistant based on OpenAI’s Generative Pre-trained Transformer (GPT-3). Here is Copilot’s tagline: “Your AI pair programmer: get suggestions for whole lines or entire functions right inside your editor.” If you’re not a programmer, a good analogy is Gmail’s offer of suggestions to finish sentences you’ve begun to type.

In mainstream news the dominant stories are about copyright (“Copilot doesn’t respect software licenses”), security (“Copilot leaks passwords”), and quality (“Copilot suggests wrong solutions”). Tech Twitter amplifies these and adds early hot takes about dramatic Copilot successes and flops. As I follow these stories, I’m thinking of another. GPT-3 is an intelligent machine. How can we apply Kasparov’s advice to work effectively with it?

Were I still a tech journalist I’d be among the first wave of hot takes. Now I spend most days in Visual Studio Code, the environment in which Copilot runs, working most recently on analytics software. I don’t need to produce hot takes, I can just leave Copilot running and reflect on notable outcomes.

Here was the first notable outcome. In the middle of writing some code I needed to call a library function that prints a date. In this case the language context was Python, but might as easily have been JavaScript or SQL or shell. Could I memorize the date-formatting functions for all these contexts? Actually, yes, I believe that’s doable and might even be beneficial. But that’s a topic for another day. Let’s stipulate that we can remember a lot more than we think we can. We’ll still need to look up many things, and doing a lookup is a context-switching operation that often disrupts flow.

In this example I would have needed a broad search to recall the name of the date-formatting function that’s available in Python: strftime. Then I’d have needed to search more narrowly to find the recipe for printing a date object in a format like Mon Jan 01. A good place for that search to land is https://strftime.org/, where Will McCutchen has helpfully summarized several dozen directives that govern the strftime function.

Here’s the statement I needed to write:

day = day.strftime('%a %d %b')

Here’s where the needed directives appear in the documentation:

To prime Copilot I began with a comment:

# format day as Mon Jun 15

Copilot suggested the exact strftime incantation I needed.
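For reference, here is a minimal, self-contained version of that lookup (the specific date is just an example, as the tweet below notes):

from datetime import date

# format day as Mon Jun 15
day = date(2021, 6, 15)
print(day.strftime('%a %d %b'))  # prints 'Tue 15 Jun' in the default locale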

This is exactly the kind of example-driven assistance that I was hoping @githubcopilot would provide. Life's too short to remember, or even look up, strptime and strftime.

(It turns out that June 15 was a Tuesday, that doesn't matter, Mon Jun 15 was just an example.) pic.twitter.com/a1epnaZRF9

— Jon Udell (@judell) July 5, 2021

Now it’s not hard to find a page like Will’s, and once you get there it’s not hard to pick out the needed directives. But when you’re in the flow of writing a function, avoiding that context switch doesn’t only save time. There is an even more profound benefit: it conserves attention and preserves flow.

The screencast embedded in the above tweet gives you a feel for the dynamic interaction. When I get as far as # format day as M, Copilot suggests MMDDYYY even before I write Mon, then adjusts as I do that. This tight feedback loop helps me explore the kinds of natural examples I can use to prime the system for the lookup.

For this particular pattern I’m not yet getting the same magical result in JavaScript, SQL, or shell contexts, but I expect that’ll change as Copilot watches me and others try the same example and arrive at the analogous solutions in these other languages.

I’m reminded of Language evolution with del.icio.us, from 2005, in which I explored the dynamics of the web’s original social bookmarking system. To associate a bookmarked resource with a shared concept you’d assign it a tag broadly used for that concept. Of course the tags we use for a given concept often vary. Your choice of cinema or movie or film was a way to influence the set of resources associated with your tag, and thus encourage others to use the same tag in the same way.

That kind of linguistic evolution hasn’t yet happened at large scale. I hope Copilot will become an environment in which it can. Intentional use of examples is one way to follow Kasparov’s advice for working well with intelligent systems.

Here’s a contrived Copilot session that suggests what I mean. The result I am looking for is the list [1, 2, 3, 4, 5].

l1 = [1, 2, 3]
l2 = [3, 4, 5]
# merge the two lists
l3 = l1 + l2 # no: [1, 2, 3, 3, 4, 5]
# combine as [1, 2, 3, 4, 5]
l3 = l1 + l2 # no: [1, 2, 3, 3, 4, 5]
# deduplicate the two lists
l1 = list(set(l1)) # no: [1, 2, 3]
# uniquely combine the lists
l3 = list(set(l1) | set(l2)) # yes: [1, 2, 3, 4, 5]
# merge and deduplicate the lists
l3 = list(set(l1 + l2)) # yes: [1, 2, 3, 4, 5]

The last two Copilot suggestions are correct; the final (and simplest) one would be my choice. If I contribute that choice to a public GitHub repository am I voting to reinforce an outcome that’s already popular? If I instead use the second comment (combine as [1, 2, 3, 4, 5]) am I instead voting for a show-by-example approach (like Mon Jun 15) that isn’t yet as popular in this case but might become so? It’s hard for me — and likely even for Copilot itself — to know exactly how Copilot works. That’s going to be part of the challenge of working well with intelligent machines. Still, I hope for (and mostly expect) a fruitful partnership in which our descriptions of intent will influence the mechanical synthesis even as it influences our descriptions.

Saturday, 10. July 2021

Vishal Gupta

Why DIRO technology is essential for crypto-currencies to go mainstream?

Core issue with crypto / stable-coins

The biggest unsolved issue with crypto is

1. Its abuse as a facilitator of ransomware / extortion — causing continuous tightening of regulations.
2. Pseudo-anonymous money laundering that creates traceable risks for banks (unlike cash, which is completely anonymous) — causing “de-risking” and therefore resulting in extreme and continuous friction for crypto to go mainstream.

Due to the irreversible nature of crypto-currencies together with pseudo-anonymous traceability of dirty money, banks face unlimited risk in serving as on-ramps or off-ramps to crypto.

US-banking grade KYC at a global level is not possible with current technology:
1. Identity documents can be tampered with or stolen — any document can be photoshopped. There is no global database of photos/biometrics, so fake or stolen identities are easy to create.
2. Impersonation checks don’t exist — trained immigration professionals in a physical setting have a 15% error rate when verifying people’s faces against hi-resolution passport photos. Matching a digital photo with a scanned ID by untrained staff, or even AI, is just not possible.
3. Address checks don’t exist — there is no global method to verify the current address of a given person. Any utility bill can be photoshopped or stolen.

The current gold standard for digital on-boarding is bank verification, either by linking your account or by verifying test deposits of a few cents in your bank account, but it only works domestically.

The issue is that this kind of account linking is not possible internationally over the SWIFT network. Alternative solutions like Plaid do not give global coverage and only work in a few core developed countries.

In the current scenario, banks in most countries (Switzerland, Singapore, the UK, etc.) require physical visits or otherwise extremely cumbersome notarisation / apostille of documents to open even non-crypto accounts.

DIRO is a fundamental new technology

It eliminates these bottlenecks, not just for crypto-currencies but horizontally for many traditional industries.

1. Makes a global impersonation check possible — it is the first breakthrough technology that turns each and every institutional / government / corporate website into an identity authenticator. This extends the gold standard for US-banking grade identity verification (like Plaid) to a global scale.
2. Makes a global address check possible — it can verify any current utility bill from any country, in any language, from any company, be it telephone, gas, electricity, TV, internet or municipal taxes, with 100% global coverage — without the need for any third-party cooperation, APIs or agreements.
3. 100% guaranteed — DIRO leverages the time-tested and battle-tested SSL/TLS technology and blockchain to generate mathematically provable and court-admissible evidence. The documents produced using DIRO are even stronger than physical originals.

Privacy and data protection — the documents produced by DIRO are not only reportable and auditable by third parties and regulators but also provide decentralised verification on blockchain (without the typical “phone-home” privacy problem). The global private data behind logins and passwords is accessed by the user under their GDPR right to access their own data, while automatically providing consent for the verification to proceed.

Friday, 09. July 2021

Bill Wendel's Real Estate Cafe

White House & DOJ target #RECartel, intersection of #BigTech & #BigRE


#AntiTrustRE: Bad day for traditional real estate business models? First, the WSJ published a hard-hitting opinion, Warning to the Real-Estate Cartel (#RECartel) https://bit.ly/WarningRECartel_WSJ Then the…

The post White House & DOJ target #RECartel, intersection of #BigTech & #BigRE first appeared on Real Estate Cafe.

Thursday, 08. July 2021

MyDigitalFootprint

Choice, decision making and judgment; is your relationship constructive or destructive?

What is NEW in this article about decision making?

The new part explains the relationship between choices, decisions and judgement and how our questions indicate if our relationship is curious and constructive OR linear, framed, and destructive.  This article is part of a masterclass I have been creating on how we, as directors and those in leadership, can improve our choices, decisions and judgements using data and be better ancestors. 

This article is not another self-help piece or “use this framework to improve decision making”; it is for the curious and those who ask questions on their journey. The reflection at the end should be "How does this article affect our views on the automation of decision making and the use of AI?"

Why is this an important topic?

Our individual and unique view of the world comprises layers of constructs created by our personality, biases, preferences, facts, and ideas learnt from past experiences.  These constructs are better known as “mental models”, which frame how we individually make sense of or align to the world. We continually develop sense-making frameworks, models, and maps in our brains that frame what we perceive as reality, thus affecting our behaviour and, consequently, our Choices, decisions, and Judgments (CDJ).

Whilst we don’t like it, our mental model frames how we see the world.  I love how Cassie Kozyrkov (chief decision-maker at Google) describes, in this article, using the toss of a coin to determine how you see the world.  Statistics predict one aspect of chance but not how you perceive the results when the coin has landed, and the outcome is known but not to you. My dad taught me to toss the coin a few more times until I got the result I wanted. It was a reinforcement model that I had made the right decision from the two possible choices.  I would also suggest following and reading Lisa Feldman Barrett’s work, especially her new book “Seven and a Half Lessons About the Brain.” She says that we must grasp that everything we perceive as reality is, in fact, constructed from fragments, and we have no idea what reality really is.  (yes, this is reinforcement bias - I am quoting things to frame you that align to my model)

In our digital era, new uncertainty, quantum risk, and more ambiguity constantly challenge our mental models and how we accommodate and make sense.  The volume of noise and signals coming into our brains is high, but we filter and convert it all into something that has meaning to each of us according to our own unique mental model.  We all reach different interpretations about what is happening, which creates frustration, confusion and misalignment. We use questions to clarify and check our understanding, assumptions and quality of information. Slight differences always remain; the unsaid, the misleading, the guided, the incentivised and overconfidence, unfortunately, take us from simple-to-correct misalignments to tension, conflict and entrenchment.

This topic matters as directors are mandated to make decisions, but we find we are operating with misalignments, tensions, compromises, and outright conflict.  This happens as the individuals sitting at the same table have agency (and their own mental models and incentives). We are unsure if we have the right choices or are clear about our judgment’s unintended consequences or long-term impact. We have to talk about it. 

We should unpack our relationship with choice, decision and judgement as mental models, hierarchy and rules constrain us. This article is not about “how to ask better questions”, nor if you should (as some questions we don’t want the answer to), but how to determine if you, your team or your company has a constructive or destructive relationship with CDJ. 

---

When talking about CDJ, you would imagine that in 2021 starting from the definitions should help, but it does not, as there is a recursive loop of using one definition to define the other words, which define themselves.  Amazingly there are professional bodies for decision making and judgement; alas, even these cannot agree on how to define or clearly demarcate between intent and actions. Our base problem is that everything framed with a maths or data mind is a decision. Everything is a choice when framed by a psychologist or social scientist. To someone who has authority and responsibility or plays with complexity, everything looks like a judgment.  Confusingly everything is an opinion to a judge!


Everything framed with a maths or data mind: is a decision. Everything is a choice when framed by a psychologist or social scientist. To someone who has authority and responsibility or plays with complexity, everything looks like a judgment.  Confusingly everything is an opinion to a judge!


Here are the definitions from the Collins and Oxford dictionaries  

Choice
Collins: [countable noun] If there is a choice of things, there are several of them, and you can choose the one you want. [countable noun] Your choice is someone or something that you choose from a range of things.
Oxford: [countable] an act of choosing between two or more possibilities; something that you can choose. [uncountable] the right to choose; the possibility of choosing.

Decisions
Collins: [countable noun] When you make a decision, you choose what should be done or which is the best of various possible actions. [uncountable noun] Decision is the act of deciding something or the need to decide something.
Oxford: [countable] a choice or judgement that you make after thinking and talking about what is the best thing to do. [uncountable] the process of deciding something.

Judgment
Collins: [uncountable noun] Judgment is the ability to make sensible guesses about a situation or sensible decisions about what to do. [variable noun] A judgment is an opinion that you have or express after thinking carefully about something.
Oxford: [uncountable] the ability to make sensible decisions after carefully considering the best thing to do. [countable, uncountable] an opinion that you form about something after thinking about it carefully; the act of making this opinion known to others.



Therefore, a judgment is the ability to make a sensible decision about a choice, which requires judgment about which choices to pick. As Yoda would say, “wise decision you make, stupid choices your judgement however selected.”

The timeless challenges of Choices, Decisions and Judgment (CDJ)

“I change my mind as the data changes” is a modern digital age sentiment from the economist John Maynard Keynes who is quoted to have said, "When the facts change, I change my mind." It was likely adapted from an early human bias where leadership in war and battles refused to change their mind even when the facts were in, and they had been proven wrong. 

Choices, decisions and judgement are not difficult if you relinquish your values, ethics and blindly follow the incentives or can fully appreciate the impact of your actions. Still, we know it is just not that simple. We have heuristics to remove choice rather than create more and ways to find more data and facts to become more informed, but without knowing if the new data is helpful or not.  Ultimately, at some point, you have to make a choice, decision or judgement. 

The diagram below represents some of the timeless challenges of CDJ, which is a balance between what we know and don’t know.  We have the experience that creates the ghosts of the past, fighting the voices of the present as we try to decide what the spirits of the future hold, whilst being held accountable for the decisions we make.

The point here is that it is always possible to find a reason why to act or not to act and the timeless challenge remains that there is no perfect choice, decision or judgment.  However, over time and because we have choices we can make decisions that improve our judgement which means we can find and select the right choices.  This is a constructive relationship between CDJ.  A destructive relationship would be to not like the choices we face, procrastinate in the hope a better choice occurs or maybe lose a choice where the decision is made for you. You don’t improve your judgment and so you cannot determine if the next choice is any better - it is linear.  

Is the CDJ relationship about framing?

From the definition section at the beginning of this article, it was evident that there is a high degree of overlap and dependency between choice, decision and judgment.  As highlighted in the previous section it can become complex very quickly but our brain (mental models) demand patterns to make sense of it and so we tend to come back to simple linear narratives which do not do them (choice, decisions, judgment) justice. What we have realised but tend not to verbalise is that none of our (mental) models or frameworks work in all cases and indeed all models about choice, decision and judgment fail.  This is why there is always a new book on the topic with new self-help content that you have not seen before, and we cling to a hope that the next model will work better - alas they don’t. I am aware of over 100 decision support models and I expect I have not really scratched the surface.   An example below puts choice, decision and judgment on a continuum between your own north star and that for a shared tribe, community or society.  It does not really work.  White flag time. 

What we have realised but tend not to verbalise is that none of our (mental) models or frameworks work in all cases and indeed all models about choice, decision and judgment fail.


#Lockdowns have enabled many social experiments.  One has been about choice and buying behaviour.  As we moved our shopping for food online and therefore missed the in-store sale specials, the end-of-gondola displays, carefully placed items next to each other, those large piled-up offers in the entrance and the sweets at the exit, our patterns have changed as choice became limited.  We became creatures of habit, buying largely the same items, which reduced in variance over time.  (This also highlights that the UI and UX for shopping sucks.)  The decision to shop was removed from us, and our choices fell, but the variety was still there.  So much for the Web creating perfect information or the algorithm knowing what you would want.

As an alternative, we can determine that CDJ will change depending on the context right now and the perception those facing the issue have.  The figure below highlights three examples but what we rapidly conclude is that complexity in CDJ is generated by speed, volume, consequences, data, sense of enquiry, situation and our own mental models. Every known variable adds layers.


Where are we up to and where next?

So far, we have explored that there is a relationship between choices, decisions and judgements. We, however, tend to focus on one of them (decision making) as it looks the most difficult, and the others are just linkages. However, this preference for “decision making” fails to understand if we are using the right word to describe the action we are taking. There is no doubt that the word decision is preferable as it is perceived as more powerful and important than choice or judgment. The relationship between them is causal, complex and relational, and we are framed, educated and incentivised to the simple linear view and a single narrative. The reality is that judgement helps frame our choices which we decide on that improve our judgment skills — they are not linear; they are circular. They are circular as we all have agency. 

 The reality is that judgement helps frame our choices, which we decide on, that improve our judgment skills

We know the linear models are broken as there is no universal tool to help with choice, decision and judgement.  The next part is to explain the relationship we have between choices, decisions and judgement and how our questions indicate if it is constructive and curious OR linear, framed and destructive.

We tend to focus on the decision and choice axis. If we can eliminate bad choices we can improve decisions, but ignore the fact that it is judgment skills that help find the right choice. Procrastination as a decision tool is framed that time will remove choice and so a decision becomes easier, either because a choice has been removed or more data supports one option. More data does not make decision-making easier; nor does it guarantee to make it better.  

Worthy of note is that academic work about “decision-making” will always seek to create the most complex solution because academics are incentivised to use the most advanced and new thinking (publication, referencing, research and reputation). Sometimes tossing a coin is the perfect solution.   The more we see the complex, the less we will accept the simple.  In a linear world where we view choice, decision and judgment as a progression line, where we are given choices when young and make a judgement when we are old and wise, we ignore that we learn and need to exercise all three all the time to become more proficient at decision making.  Decision making is not a task or an end game.

As a thought experiment: if you had all the data in the world and all the compute power possible to run a perfect forecast model, would you need to make further choices, decisions or judgments?  To reach your view I assume you have decided to either give each human full agency or you have taken our individual agency away.  Now in my definition of data, which might be different to yours, how we all behave with our agency is just data, which opens up a can of worms for each of us. Can you have full agency (freewill) and it all still be modelled? What do we mean about behavioural modelling, agency, data and even more precisely all data?

Getting to one data set, one tool, one choice hides layers of complexity which looks good for one better decision but is unlikely to be a long term solution to improving choices, decisions and judgment.  You have/had choice, you make/made decisions, you exhibit/exercise judgment.  Judgement supports you in finding choices. 

How the questions we ask inform us! 

The questions you are currently asking will help inform you where you are in the cycle, and indeed whether you have already closed the loop and have a constructive cycle or a linear, open-loop destructive relationship with choice, decisions and judgment.

The circular framing means depending on the stage we are in a process we will be asking different questions.  Given that a board has to deal with many situations all at different stages at every meeting, we should see all these questions asked at every meeting, just framed to different agenda items.  If there is no variation in questions,  surely it tells us something.  Indeed are we using our improvement at each stage to further improve the next outcome.  

The destructive “cycle” is not circular but is a linear model as it is fundamentally disconnected from a learning based idea using choice and judgement and focussed on Better Decision making. It depends on the idea that by reading a new book or going on a new decision making course that we will get better.  The iteration of improvement is an external influence.  Perhaps it is mentoring or coaching and they (mentors) keep you from closing the loop as they either don’t know or it would not support their business model. Indeed books, education and courses have no interest in you closing the loop and always making you believe in a new tool, method, process - it is their business model!

In the boardroom or in the senior management meetings is someone always asking the same questions?  Does the agenda mean that as a team individuals are unable to ask different questions or is it that others are influencing the agenda to keep it framed to the decisions and outcomes they want.  Why enable the asking for more data when the choice they want you to decide on and agree with is already the best supported case (railroaded).  Do you have observers or assessors who look at the questions you ask as a team and determine if you are asking the right questions? (something I am learning to do)  Can skills in choice, decision and judgment be determined by the questions asked in the meetings? (something I am exploring)

A key question I often come back to when thinking about choice, decisions and judgment is “What are we optimising for?” I find the model below a helpful guide to understand the framing.  In a linear model it would present moving from single choice in the lower left to complex judgment in the top right. In a learning cyclic model, choice, decision and judgement equally applies, however the lower left is a better learning and experience gaining context for mentoring, or succession planning, than the top right. 

Why does this matter, because right now we have increasing data along with more vulnerabilities, ambiguity, complexity and uncertainty. The volume of moving variables (change) and the rate of change breaks the linear model as you can never match the model at hand to the situation you face.  There is a need to move from linear ideas of choice, decisions and judgment to a circular one.  Linear thinking is the best model when there are limited options and there is stability.  We now have many options, increasing variables and more instability.  

As your company is owned by no-one (yes, you read that right - follow this link if you need to read more on this; a company owns itself) and a company cannot think for itself, we (the directors) have to do the thinking for it.  We are given in law the authority and responsibility to act on behalf of the company.  This is “fiduciary duty”.  It is the reason why directors need to move from a linear perspective of decision making to a circular improvement learning process of choice, decision making and judgement.

My proposal to help support better governance is that we request companies publish the questions asked in a board meeting. Not the answers, but definitely the questions.


Wednesday, 07. July 2021

MyDigitalFootprint

The railroad of (no) choice


In those first 100 days, it will become evident if you are about to be railroaded or if you have to present the choice without creating railroading for those data laggards. The latter being the job, the former being a problem.

To be clear, railroaded means in this context:

1. to force something to be officially approved or accepted without much discussion or thought.
2. to force someone into doing something quickly, usually without enough information.

As the CDO, are you about to find that the tracks have already been laid, that you are on the train, and that it is going in one direction? You are now the figurehead of the new shiny data plan based on already accepted wisdom. Your hope, before starting the role, is that it is more analogous to a personal transport situation. This would be where you get to pick the fuel (food, combustible), vehicle (walk, run, bike, motorcycle, car, van, lorry, aeroplane, boat), the destination and the route. Using the analogy of the train on the tracks, the decision to create a data-led organisation, the ontology, the data we have and the value we will create has been made irrespective of what you come up with, or what the data might be able to tell you. The processes are in place, and your role is not to craft, invest, create or imagine but to follow the tracks of command and control. It is at that moment that you realise just how good the CEO is at sales.

In this case, your priorities are: How to find the right tools that align or match what you are faced with? And, how do you take a senior leadership team and fellow directors on a journey to discover that they can change the rails, change the vehicle and be open to changing how we see the journey from A to B? Indeed based on a pandemic, do we need to physically move at all?

When the “railroaded” is you and you are being asked to approve something when you have not had the time or opportunity to gather the data or understand the processes, what are the right words to create time to explore and not appear indecisive, trying to procrastinate or evasive?

what are the right words to create time to explore so as not to appear indecisive, trying to procrastinate or evasive?

Presenting an argument about why you need more time is likely to fail, as the defensive response will always be “we have already done the work but it is your decision — trust us”. Asking questions that check that the choices and opinions are the right ones, or that determine the consequences, will equally draw you into an argument where “the facts are the facts”, you have to decide, or “do you not trust us?”

One question that will give some breathing space is “For what are we optimising,” as explored below.

The two axes of the model set the horizontal scale of the short and long term against the vertical axis of how many variables, from one to many. In asking the question “for what are we optimising,” you are not presenting this framework, but using the question to understand what the “railroading” is trying to achieve, through understanding what the expected decision is optimising for. With this knowledge, you may find it is easier to take the railroaded decision. If you can conclude they are optimising for many externalities and the long term, you are immersed in a team that can cope with and manage complexity - this is going to be fun. If this is short term and a single variable, you know that incentives are the significant driver and slowly changing the incentives will create a better place for making more complex data decisions. One scenario is outside of this framing, which is that they actually don’t know what they are optimising for and the decision at hand is just another reaction to previous poor railroaded decisions. Definitely a red flag moment.

Note to the CEO

Using railroading as an activity to discover an individual’s capability and backbone is always likely to go wrong. In this data-driven complex world where both the data and the analysis are a long way from your immediate access, you may also feel you are being railroaded by the signals and noise. It is likely you will ask “is this recommendation optimising for xyz (the plan you have agreed to) and how can you show this is the optimal path without using the same data”. Love the misattributed quote that Ronald Reagan stole from the Russians, but so can we: trust, but verify.


Here is a controversial startup carbon capture plan — anyone else in?


Humans are made up of carbon: about 18% of the average person (140 pounds / 62 kg / 10 stone) is carbon.


The question is how to securely and respectfully wrap bodies airtight in poor-grade, non-reusable recycled plastics and bury the person and plastic deep in an unused mine, to capture the carbon from the plastic and our remains for the long term.

Based on 54 million deaths per year, the opportunity is to remove 54 million * 62 kg * 18% ≈ 602 million kg, or roughly 650,000 tons, of carbon per year. 100% is highly unlikely due to religion, personal preference and the remoteness of many deaths. However, is 30% of this achievable? Probably.

Focus just on those countries that cremate, through legislative changes and the adoption over time of using your last will as an act of kindness to future generations. There is an issue about individual values for memorials and “ashes” — however, most are now scattered, which means there is no specific place. We need a better way to remember those who created us.

Does this make a dent? Well yes.

Storage facilities.

We have mines across all countries which have not been backfilled. The biggest issues are logistics, contamination, rats, and ensuring we don’t release more carbon than we capture by running this. It is about being respectful, finding a way for those who care about future generations (50% of the population) and enabling them to be instrumental in using their body to take carbon out. Can it become part of a Living Will?

Is this something you would consider — being wrapped or want to get involved in setting up?

Saturday, 03. July 2021

Aaron Parecki

How to export your complete Foursquare checkin history


Today I finished up a tool that you can use to export your complete history from Foursquare and publish the checkins to your website!

In 2017, I created OwnYourSwarm to export my future Swarm checkins to my website in real-time. It's been working great, and it's meant that I have had a complete archive of all my checkins on my website ever since then, so I don't have to worry if Foursquare disappears one day. However, I never got around to creating a way to export my past checkins, so I was always missing my checkin history from 2009-2016.

I had been considering building this as a feature of OwnYourSwarm, but realized that it would end up taking a lot of additional effort to make it work well as a web service, in addition to dealing with possible rate limit issues with the Foursquare API. So instead, this is published as a downloadable script you can run on your own computer. This also means you have a bit more flexibility in how you can use it, as well as being able to customize it more if you choose.

You can download the code from GitHub here:

https://github.com/aaronpk/Swarm-Checkins-Import

The readme has installation and usage instructions, so I'll refrain from repeating all that in this post. Make sure to check out the step by step tutorial in the readme if you want to use this with your own account.

The process is broken up into a few steps.

First, it downloads your entire Foursquare checkin history to JSON files on your computer.
Second, it downloads all the photos from your checkins.
Third, it publishes each checkin to your website via Micropub.

If your website doesn't support checkins via Micropub, or if you don't want your checkins on your website at all, you can just skip that step entirely, and instead you'll have a complete export of your data locally.

The JSON files contain the raw API response from Foursquare, so you can do what you want with that as well, such as turning it into your own HTML archive using your own custom tools if you want.
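As a rough sketch of what you could do with that raw export, the snippet below walks a directory of downloaded JSON files and prints one line per checkin. It assumes each file holds a raw page of the v2 checkins response (with items under response.checkins.items) and that the files live in a checkins/ folder; adjust the path and keys to match what the script actually writes.

import json
from datetime import datetime, timezone
from pathlib import Path

# Assumed layout: one JSON file per API page, each containing the raw
# Foursquare v2 response, with checkins under response.checkins.items.
for path in sorted(Path("checkins").glob("*.json")):
    page = json.loads(path.read_text())
    items = page.get("response", {}).get("checkins", {}).get("items", [])
    for checkin in items:
        created = datetime.fromtimestamp(checkin["createdAt"], tz=timezone.utc)
        venue = checkin.get("venue", {}).get("name", "(unknown venue)")
        print(f"{created:%Y-%m-%d} {venue}")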

The one issue that I don't have a good solution for is handling renamed venues. Unfortunately the API returns the current name for old checkins, so if a venue is renamed, your old checkins will not reflect the name at the time of the checkin. This is particularly strange for businesses that have gone through an acquisition or rebranding, since for example all my old checkins in Green Dragon are now labeled as Rogue Eastside Brewery. As far as I can tell there isn't a good way to handle this, so I may have to go back and manually edit the posts on my website for the venues I know have been renamed.

I hope this is useful to people! I will be sleeping a little easier now knowing that my old checkin history is safely archived on my website now!

Thursday, 01. July 2021

Nader Helmy

Rendering credentials in a human-friendly way

At MATTR we’re always dreaming up ways to make decentralized identity and verifiable credentials easy and intuitive to use for as many people as possible. From the start, it’s been a core part of our mission to make sure that end users understand the implications of decentralized identity and the control it affords them over their data and their privacy. This model offers users greater sovereignt

At MATTR we’re always dreaming up ways to make decentralized identity and verifiable credentials easy and intuitive to use for as many people as possible.

From the start, it’s been a core part of our mission to make sure that end users understand the implications of decentralized identity and the control it affords them over their data and their privacy. This model offers users greater sovereignty over their own information by empowering individuals as both the holder and subject of information that pertains to them. Users are able to exercise their new role in this ecosystem by utilizing a new class of software known as digital wallets.

We first released our Mobile Wallet for smartphones in June 2020, with a simple user interface to allow people to interact with and receive credentials from issuers as well as present credentials to relying parties. In the interim, we have developed a number of improvements and features to the Mobile Wallet to support advanced capabilities such as:

Authenticating to Identity Providers over OpenID Connect to receive credentials via OIDC Bridge
Deriving privacy-preserving selective disclosure presentations from credentials using BBS+ signatures
Establishing a secure DID messaging inbox for users to receive encrypted messages and push notifications

These changes have not only made the wallet more functional; they’ve also evolved to better protect users’ best interests — giving them privacy-by-design and surfacing the information and context that they need to confidently make decisions underpinned by the security of decentralized identity.

Timeline of MATTR wallet development

This journey has led us to realize the importance of creating a wallet experience that places users front and center. As these systems create more opportunity for user-driven consent and identity management, they’ve simultaneously created a demand for a wallet that can not only perform the technical operations required, but do so in a user-friendly way that surfaces the information that truly matters to people. Our latest feature release to the MATTR Mobile Wallet is a step in this direction.

With Human-Friendly Credentials, we have added the capability to render different kinds of credentials uniquely in the wallet interface according to the information they contain. Until now, the end user experience for verifiable credentials has been largely consistent across different categories of credentials and issuers. In other words, a credential containing medical data from your doctor looks exactly the same as an education credential from your university or a concert ticket from a music venue: they all appear to the user as a long list of claims.

In this release we change that. Thanks to the semantic information encoded in verifiable credentials, the wallet is now able to understand and interpret certain kinds of credentials to render them to the user in a way that makes the data easier to understand.

JSON-LD verifiable credentials have the ability to support common data vocabularies and schemas which are published on the web. For example, if a credential contains a claim describing the name of an individual, the claim can be defined via reference to an existing data vocabulary found here: https://schema.org/Person
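
To make that concrete, here is a minimal, made-up example of what such a credential payload could look like, with the subject's claims drawn from the schema.org Person vocabulary. The values and identifiers are invented for illustration and are not MATTR's actual credential format.

# Illustrative only: a JSON-LD verifiable credential whose subject claims
# reuse schema.org terms such as Person, givenName and email. All values
# below are placeholders, not a real MATTR credential.
import json

credential = {
    "@context": [
        "https://www.w3.org/2018/credentials/v1",
        "https://schema.org",
    ],
    "type": ["VerifiableCredential", "PersonalIdentityCredential"],
    "issuer": "did:example:issuer-123",
    "issuanceDate": "2021-06-30T00:00:00Z",
    "credentialSubject": {
        "type": "Person",          # https://schema.org/Person
        "givenName": "Alice",      # https://schema.org/givenName
        "familyName": "Example",
        "email": "alice@example.com",
        "telephone": "+64 21 000 0000",
    },
}

print(json.dumps(credential, indent=2))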

Human-Friendly Credentials allow the wallet to start intelligently displaying known credential types and data types. This shows up in a variety of different ways in a user’s dataset.

For example, this update formats address fields to make them more readable; formats names and proper nouns where possible; makes URLs, telephone numbers and email addresses clickable; highlights images and icons for better trust and brand signaling; and creates basic rules for language localization that adjust to a user’s device settings. This logic allows the wallet to create a kind of information hierarchy that starts to draw out the important aspects of data included in a credential, so users can make trust-based decisions using this information.

These rules are applied to any kind of credential the wallet encounters. Whilst all of this is incredibly helpful for users, we have gone a step further: displaying entire credentials in a completely unique UI according to their type. The ‘type’ property of a credential expresses what kind of information is in the credential — is it a degree, a medical record, a utility bill? The usage of common credential types across different implementations and ecosystems is necessary for semantic interoperability on the broader scale. The wallet should be able to recognize these common credential types and display them to the user in a friendly way.

In this update, we have added custom rendering for both Personal Identity Credentials as well as Education Courses. These are common types we see occurring naturally across the decentralized identity landscape, and now the MATTR Mobile Wallet is able to recognize and display them properly.

Personal Identity Credentials and Education Course credentials
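
One way to picture this behaviour is a simple dispatch on the credential's type, falling back to the generic claims list when the type is unknown. The sketch below is purely illustrative; the type names and renderers are assumptions, not how the MATTR wallet is implemented.

# Illustrative sketch: choose a renderer for known credential types and fall
# back to a plain list of claims otherwise. Type names are invented.
def render_generic(credential):
    subject = credential.get("credentialSubject", {})
    return "\n".join(f"{key}: {value}" for key, value in subject.items())

def render_identity(credential):
    subject = credential.get("credentialSubject", {})
    return f"{subject.get('givenName', '')} {subject.get('familyName', '')}".strip()

def render_course(credential):
    subject = credential.get("credentialSubject", {})
    return subject.get("courseName", "Education Course")

RENDERERS = {
    "PersonalIdentityCredential": render_identity,  # assumed type name
    "EducationCourseCredential": render_course,     # assumed type name
}

def render(credential):
    for credential_type in credential.get("type", []):
        if credential_type in RENDERERS:
            return RENDERERS[credential_type](credential)
    return render_generic(credential)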

An important note to consider is that Human-Friendly Credentials only work for known data and credential types. As the ecosystem matures, we expect to add more data types and credential types in the future to support an even broader set of human-related processes and actions. In a general sense, we will also continue to iterate on our digital wallet offerings to provide a greater degree of flexibility and control for end users. We believe it’s a vital component to a healthy digital trust ecosystem.

To check out these changes yourself, download the latest version of our Mobile Wallet to get started.

For more information on Human-Friendly Credentials, check out our tutorials and video content on MATTR Learn.

For everything else related to MATTR, visit https://mattr.global or follow @MattrGlobal on Twitter.

Rendering credentials in a human-friendly way was originally published in MATTR on Medium, where people are continuing the conversation by highlighting and responding to this story.


Moxy Tongue

Recursive Signatory (American Rights)

A Sovereign Nation is only capable of two structural models of participation; 1. Database model - data plantation based on possession of ID assets structured as liabilities  2. Self-Sovereign model - root authority at edges, admin verification mechanism at center. In order for #2 to live in the world, a recursive signatory from one generation to the next authenticating the structura
A Sovereign Nation is only capable of two structural models of participation;
1. Database model - data plantation based on possession of ID assets structured as liabilities 
2. Self-Sovereign model - root authority at edges, admin verification mechanism at center.
In order for #2 to live in the world, a recursive signatory from one generation to the next authenticating the structural model of participation and Rights is required. 
In America, John Hancock serves as our muse when considering the structure of Sovereign authority and the ID/signatory of participation within civil Society. The source of authority demonstrated by John Hancock and his peers in signing the "Declaration of Independence" had a personal origin. These people all put their personal lives, security, fortunes and reputations on the line, literally, and there was no authentication system in place when they did so. In essence, they used a personal relationship network to validate their own personal Sovereign source authority to declare anything meaningful in nascent Sovereign Terms.
You, as an American citizen, are no less than these people. Unless, of course, you Agree to participate in an ID/signatory based system of Rights without holding the system accountable to enforce your own participatory authority.
American Rights do not resolve from a database or State Trust storing birth certificates. American Rights are for, by, of people, not birth certificates. Of course, in administrative practice, try exercising your personal Rights without these administrative documents, including passports, driver's licenses, social security numbers, etc... which all originate and resolve to an administrative source of authority.
If you are provisioned authority to act as an American citizen, are you actually an American? Structure yields results in such concerns, and every person who has ever started a corporation and taken responsibility to protect the "corporate veil" from being compromised understands the importance of structural integrity. Unfortunately, the average American citizen does not, and the process of demonstrating America's model of authority is not something that is taught in public schools, funded by American tax payers.
American authority originates in people, Individuals all. This is not an optional requirement. Any other structural model is literally "un-American", and out of step with Constitutional integrity. Do law schools understand this?
A recursive signatory serves as the foundation of American ID within all Government-supported administration methods, such that the originating act of "Declaring Independence" is conveyed to every person, every Individual citizen, in the same manner as the founders demonstrated personally.
The founders of America did not demonstrate one model of American Sovereignty, and then convey a data plantation to everyone else to comply with. Instead, one generation after the next is required to protect the very Institution derived "of, by, for" people, Individuals all, by way of the structural process of inaugurating authentic participation.
Americans give their personal authority to their Governance. Americans do not receive their authority from their Government. The American people are not the dog of their government, because structurally, the Government is guaranteed to operate as the best friend of the people, Individuals all. This structural process, putting the authority of people before the authority of a bureaucratic Institution is the literal structural integrity differentiating THE UNITED STATES OF AMERICA from every other Nation on Earth.
Generation after generation since America's founding has had their personal Sovereign source authority squandered and mis-represented by an "error of omission" from the structural process of forming an identity with recursive signatory integrity in the administration of American Rights.
This order of operations failure, putting a false model of administrative precedence within a database, is of utmost importance to how American Rights are administered. Without an accurate administrative precedence belonging to people, Individuals all, the Government becomes the keeper of people, rather than people being the keeper of their Governance.
For some reason, this is not obvious. For some reason, lawyers get this wrong constantly. The concept of a recursive signatory tied to Rights administration exceeds the intellectual grasp of legal managers living on the data plantation they have derived their administrative methods from. This must be repaired, for a Government derived "of, by, for" people, Individuals All.. can not exist otherwise.
America can not exist in practice without a recursive signatory event tied to Rights administration for every American citizen, giving form and function to the self-Sovereign method of Rights administration that gave life to America, and sustains it to this day.

Tuesday, 29. June 2021

Moxy Tongue

All Along Watchtower



Identity Woman

Special Topic IIW 1/2 Day Virtual Events – UX July 22nd and Business Aug 4th

I’m super excited to announce that we have two different special topic IIWs coming up. If your interest or practice focuses on either of these, we invite you to join us!!! User-Experience and SSI is coming up Thursday July 22nd. The Business of SSI is coming up Thursday August 4th. From the EventBrite about the […] The post Special Topic IIW 1/2 Day Virtual Events – UX July 22nd and Business Aug 4

I’m super excited to announce that we have two different special topic IIWs coming up. If your interest or practice focuses on either of these, we invite you to join us!!! User-Experience and SSI is coming up Thursday July 22nd. The Business of SSI is coming up Thursday August 4th. From the EventBrite about the […]

The post Special Topic IIW 1/2 Day Virtual Events – UX July 22nd and Business Aug 4th appeared first on Identity Woman.

Monday, 28. June 2021

DustyCloud Brainstorms

Hello, I'm Chris Lemmer-Webber, and I'm nonbinary trans-femme

I recently came out as nonbinary trans-femme. That's a picture of me on the left, with my spouse Morgan Lemmer-Webber on the right. In a sense, not much has changed, and so much has changed. I've dropped the "-topher" from my name, and given the common tendency to apply gender …

I recently came out as nonbinary trans-femme. That's a picture of me on the left, with my spouse Morgan Lemmer-Webber on the right.

In a sense, not much has changed, and so much has changed. I've dropped the "-topher" from my name, and given the common tendency to apply gender to pronouns in English, please either use nonbinary pronouns or feminine pronouns to apply to me. Other changes are happening as I wander through this space, from appearance to other things. (Probably the biggest change is finally achieving something resembling self-acceptance, however.)

If you want to know more, Morgan and I did a podcast episode which explains more from my present standing, and also explains Morgan's experiences with being demisexual, which not many people know about! (Morgan has been incredible through this whole process, by the way.)

But things may change further. Maybe a year from now those changes may be even more drastic, or maybe not. We'll see. I am wandering, and I don't know where I will land, but it won't be back to where I was.

At any rate, I've spent much of my life not being able to stand myself for how I look and feel. For most of my life, I have not been able to look at myself in a mirror for more than a second or two due to the revulsion I felt at the person I saw staring back at me. The last few weeks have been a shift change for me in that regard... it's a very new experience to feel so happy with myself.

I'm only at the beginning of this journey. I'd appreciate your support... people have been incredibly kind to me by and large so far but like everyone who goes through a process like this, it's very hard in those experiences where people aren't. Thank you to everyone who has been there for me so far.


Phil Windley's Technometria

Building an SSI Ecosystem: Health Passes and the Design of an Ecosystem of Ecosystems

Summary: Ever since the Covid pandemic started in 2020, various groups have seen verifiable credentials as a means for providing a secure, privacy-respecting system for health and travel data sharing. This post explores the ecosystem of ecosystems that is emerging as hundreds of organizations around the world rise to the challenge of implementing a globally interoperable system that also resp

Summary: Ever since the Covid pandemic started in 2020, various groups have seen verifiable credentials as a means for providing a secure, privacy-respecting system for health and travel data sharing. This post explores the ecosystem of ecosystems that is emerging as hundreds of organizations around the world rise to the challenge of implementing a globally interoperable system that also respects individual choice and privacy.

In The Politics of Vaccination Passports, I wrote about the controversy surrounding proposals to require people to show proof of vaccination to travel–specifically fly. The worry is that once we get used to presenting a health credential to travel, for example, it could quickly spread. Presenting an ID of some kind could become the default–with bars, restaurants, churches, stores, and every other public place saying "papers, please!" before allowing entry.

My personal take is that while I'd rather avoid that scenario, it is likely inevitable. And if that's true, I'd love to have it designed in a way that respects individual choice and personal privacy as much as possible. This is a tall order because getting the tech right is only the first step on a long road. The air travel industry is global, gargantuan, and incredibly decentralized. Building an interoperable system of travel passes requires not just an ecosystem, but an ecosystem of ecosystems.

Suppose Alice lives in Naruba and has been vaccinated against Covid. The local hospital where she was vaccinated has issued a credential to Alice with the data about her doses, the dates, vaccine batch numbers, and so on. Alice is traveling on business to the Emirate of Kunami which requires proof of vaccination. To fly, Alice must present a health pass to the gate agent who works for Kunami Air in the Soji airport. How does Kunami Air know they can trust a credential issued by a hospital they've never heard of, located in another country?

Building a global, interoperable health and travel pass network requires both technology and governance. This sort of situation isn't new. In fact, if we replaced "proof of vaccination" with "airline ticket" in the preceding scenario, we wouldn't think anything of it. Different companies from different countries have been figuring out how to interoperate and trust the results for decades, maybe centuries.

But doing that digitally, and quickly, is a big job with lots of moving parts. Below I take a look at a few of those parts and how they come together to solve the problem.

Good Health Pass Collaborative

The Good Health Pass Collaborative (GHPC) is an "open, inclusive, cross-sector initiative" of dozens of companies from the travel, health, and technology industries that is defining the blueprint for health and travel passes that are privacy-protecting, user-controlled, equitable, globally interoperable, and universally accepted for international travel.

GHPC is very specific about what a "pass" is:

A credential to which data minimization and anti-correlation have been applied and any relevant travel data has been added so it includes only what a verifier needs to make a trust decision in a specific context (such as boarding a plane).

A credential could have lots of health and travel data. The pass contains just the data needed for a specific context.
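
As a purely conceptual sketch of that data minimization step, the snippet below filters a full credential down to the claims one verifier needs. The claim names are invented, and real systems derive passes with cryptographic selective disclosure (such as BBS+ proofs) rather than by copying plaintext fields.

# Conceptual sketch of data minimization: a "pass" carries only the claims a
# verifier needs in one context. Claim names are invented; real passes are
# derived with selective disclosure proofs, not plaintext filtering.
FULL_CREDENTIAL = {
    "name": "Alice Example",
    "dateOfBirth": "1988-04-02",
    "vaccine": "XYZ-19",
    "doses": 2,
    "lastDoseDate": "2021-05-20",
    "batchNumbers": ["A123", "B456"],
    "clinic": "Naruba Central Hospital",
}

# What a gate agent needs to make a boarding decision in this scenario.
BOARDING_FIELDS = ["name", "vaccine", "doses", "lastDoseDate"]

def derive_pass(credential, fields):
    """Keep only the fields required by the verifier for this context."""
    return {field: credential[field] for field in fields if field in credential}

print(derive_pass(FULL_CREDENTIAL, BOARDING_FIELDS))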

GHPC published a set of ten principles in February to guide their work. These principles lay out a path that is consistent with self-sovereignty, respects individual autonomy, and promotes good privacy practices.

In June, GHPC published a blueprint that provides recommendations for building a good health pass. The challenge of meeting the principles while exchanging health data across different industry sectors can be overwhelming. The blueprint provides specific guidance on how to meet this challenge without sacrificing the principles that make a health pass "good." The recommendations include designs for trust registries and frameworks, data and protocol standards, and other components for the global interoperability of COVID certificate ecosystems.

These efforts say what a good health pass is and give guidance for creating a credential ecosystem, but they don't create any such ecosystems. That's a job for other organizations.

The Global COVID Certificate Network

The GHPC Blueprint provides recommendations about trust registries and frameworks, data and protocol standards, and other requirements to create globally interoperable ecosystems for COVID certificates. The Linux Foundation's Global COVID Certificate Network (GCCN) "operationalizes and adapts the GHPC Blueprint recommendations for governments and industry alliances who are working to reopen borders." You can think of GCCN as an instantiation of the GHPC Blueprint.

Going back to our traveler, Alice, we can imagine that Naruba and Kunami both have national health systems that can follow the blueprint from GHPC. When Alice uses her health pass inside Naruba, it is reasonable to expect that Narubian businesses will know what a real Narubian health pass looks like and whether to trust it or not. To make this possible, the Narubian health ministry would determine what data a legitimate pass contains (e.g. its schema) and what forms it takes such as the paper design and digital format. The ministry could also determine who in Naruba can issue these health passes and even set up a registry so others in Naruba can find out as well.

This kind of one-size-fits-all, single-source solution can solve the local problem, but when Alice is interacting with people and organizations outside Naruba, the problem is much harder:

Naruba and Kunami may have adopted different technologies and schema for the credential representing the health pass.
Random organizations (and even people) in Kunami need to be able to establish the fidelity of the credential. Specifically, they want to know that it was issued to the person presenting it, hasn't been tampered with, and hasn't been revoked.
In addition, these same entities need to be able to establish the provenance of the credential, specifically that it was issued by a legitimate organization who is authorized to attest to the holder's vaccination status.

This is where GCCN comes in. GCCN has three components:

a trust registry network
a certificate implementation toolkit
a set of recommended vendors

The trust registry network and its associated protocol not only helps Naruba and Kunami each establish their own registries of authorized health pass credential issuers, but also enables a directory of registries, so an organization in Kunami can reliably find the registry in Naruba and discover if the issuer of Alice's credential is in it.
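
The sketch below illustrates that two-step lookup in the abstract. The directory URL, endpoint paths, and response fields are all invented for illustration and are not the actual GCCN trust registry protocol.

# Hypothetical sketch of the verifier's two-step check: find the right
# national registry via a directory, then ask whether the issuer is listed.
# Endpoints and response shapes are invented, not the GCCN protocol.
import json
import urllib.request

DIRECTORY_URL = "https://registry-directory.example.org/registries"  # placeholder

def fetch_json(url):
    with urllib.request.urlopen(url) as response:
        return json.load(response)

def issuer_is_trusted(issuer_did, country_code):
    # Step 1: locate the authoritative registry for the issuer's country.
    directory = fetch_json(DIRECTORY_URL)
    registry_url = directory[country_code]["registry_url"]

    # Step 2: ask that registry whether this issuer may issue health passes.
    entry = fetch_json(f"{registry_url}/issuers/{issuer_did}")
    return entry.get("status") == "authorized"

# e.g. a verifier in Kunami checking the hospital that issued Alice's credential:
# issuer_is_trusted("did:example:naruba-central-hospital", "NB")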

The toolkit provides several important components for a working ecosystem:

a template for a governance framework that governments and industry alliances can use to make their own policies.
schema definitions and minimum data sets for the credentials.
technical specifications for the software components needed to issue, hold, and verify credentials.
implementation guides and open source reference implementations.
guidance for creating the governance framework and technical implementation.

The vendor network provides a commercial ecosystem to which governments and industry associations can turn for support: a set of organizations who have competence in building credential ecosystems. Over 25 separate companies and organizations support GCCN.

With all this, GCCN doesn't actually build the ecosystems. That falls to organizations who use GCCN to instantiate the framework provided by GHPC.

Building the Ecosystem

One of those organizations is Cardea. Cardea is a fully open-source and field-tested verifiable digital credential that meets the major requirements of the Good Health Pass Blueprint. Cardea was developed by Indicio and is now a community-led project at Linux Foundation Public Health (LFPH).

Cardea was trialed on the island of Aruba by SITA, one of the participating companies in the Cardea initiative. SITA is a good partner for this since they're responsible for a majority of airline industry IT systems. In the pilot, travelers could prove their Covid test status at restaurants and other tourist locations around the island. The combination of a good trust framework, and self-sovereign-identity-based credential technology allowed businesses to trust tourists' health information while preserving their privacy.

Another example of a working health pass credential ecosystem is the IATA Travel Pass. Travel Pass has been developed by the International Air Transport Association (IATA) and leverages verifiable credential technology from Evernym. The IATA Travel Pass is conducting initial pilots with over 50 airlines and the International Airlines Group (IAG).

A third example is Medcreds from ProofMarket. Medcreds uses the Trinsic API to provide service. MedCreds partnered with the Georgia Academy of General Dentistry to provide free COVID-19 test credentials to their providers and patients. MedCreds allows dentists to reduce the risk of spreading and contracting COVID-19 by knowing with greater certainty the status of a patient's most recent test.

Cardea, IATA Travel Pass, and Medcreds have tools for hospitals, labs, and other organizations who issue passes and for the airlines and other venues who verify them. All also have wallet apps for credential holders to use in receiving credentials as well as presenting the proofs that constitute the actual travel pass from those credentials. In addition, all three initiatives include governance frameworks that service their respective ecosystem.

Because all three are compliant with the GHPC's Blueprint and GCCN's architecture and guidance, the ecosystems they each create are part of the global ecosystem of ecosystems for health and travel passes. As more organizations, countries, and technical teams join in this, the global ecosystem of ecosystems will constitute the largest ever verifiable credential use case to date. The planning, design, and implementation show the level of effort necessary to create large credential ecosystems and provide a path for others to follow.

As I said in The Politics of Vaccination Passports, I'm persuaded that organizations like the Good Health Pass Collaborative and Global COVID Certificate Network aren't the bad guys. They're just folks who see the inevitability of health and travel passes and are determined to see that it's done right, in ways that respect individual choice and personal privacy as much as possible.

Tags: verifiable+credentials identity ssi use+cases healthcare


Werdmüller on Medium

Storytelling for Ma

Coping with my mother’s death by telling her a story. Continue reading on Medium »



Hyperonomy Digital Identity Lab

Trusted Digital Web: 8-Layer Architecture Reference Model (TDW-ARM)

Github: https://github.com/mwherman2000/TrustedDigitalWeb After about 2.5 years, I finally have an ARM that I like and a code base that is starting to show some promise… 8-Layer Architecture Reference Model (TDW-ARM) Click the model to enlarge it. Trusted Digital Assistant (TDA) … Continue reading →

Github: https://github.com/mwherman2000/TrustedDigitalWeb

After about 2.5 years, I finally have an ARM that I like and a code base that is starting to show some promise…

8-Layer Architecture Reference Model (TDW-ARM)

Click the model to enlarge it.

Trusted Digital Assistant (TDA) Prototype

Github: https://github.com/mwherman2000/TrustedDigitalWeb

Microsoft “Trinity” Graph Engine

Web site: https://www.graphengine.io/

Github: https://github.com/microsoft/GraphEngine

Verifiable Credential Notarization: User Scenarios 0.25
Verifiable Notarization Protocol (VCNP) 0.25
TDW Agents, Wallets, and VDR: ARM

Damien Bod

Sign-in using multiple clients or tenants in ASP.NET Core and Azure AD

The article shows how an ASP.NET Core application could implement a sign in and a sign out with two different Azure App registrations which could also be implemented using separate identity providers (tenants). The user of the application can decide to authenticate against either one of the Azure AD clients. The clients can also be […]

The article shows how an ASP.NET Core application could implement a sign in and a sign out with two different Azure App registrations which could also be implemented using separate identity providers (tenants). The user of the application can decide to authenticate against either one of the Azure AD clients. The clients can also be deployed on separate Azure Active directories. Separate authentication schemes are used for both of the clients. Each client requires a scheme for the Open ID Connect sign in and the cookie session. The Azure AD client authentication is implemented using Microsoft.Identity.Web.

Code: https://github.com/damienbod/AspNetCore6Experiments

The clients are set up to use a non-default Open ID Connect scheme and also a non-default cookie scheme. After a successful authentication, the OnTokenValidated event is used to sign into the default cookie scheme using the claims principal returned from the Azure AD client. “t1” is used for the Open ID Connect scheme and “cookiet1” is used for its cookie scheme. No default schemes are defined. The second Azure App Registration client configuration is set up in the same way.

services.AddAuthentication()
    .AddMicrosoftIdentityWebApp(
        Configuration.GetSection("AzureAdT1"), "t1", "cookiet1");

services.Configure<OpenIdConnectOptions>("t1", options =>
{
    var existingOnTokenValidatedHandler = options.Events.OnTokenValidated;
    options.Events.OnTokenValidated = async context =>
    {
        await existingOnTokenValidatedHandler(context);
        await context.HttpContext.SignInAsync(
            CookieAuthenticationDefaults.AuthenticationScheme,
            context.Principal);
    };
});

services.AddAuthentication()
    .AddMicrosoftIdentityWebApp(
        Configuration.GetSection("AzureAdT2"), "t2", "cookiet2");

services.Configure<OpenIdConnectOptions>("t2", options =>
{
    var existingOnTokenValidatedHandler = options.Events.OnTokenValidated;
    options.Events.OnTokenValidated = async context =>
    {
        await existingOnTokenValidatedHandler(context);
        await context.HttpContext.SignInAsync(
            CookieAuthenticationDefaults.AuthenticationScheme,
            context.Principal);
    };
});

The AddAuthorization method is used in a standard way and no default policy is defined. We would like the user to be able to choose which tenant and client to authenticate against.

services.AddAuthorization();

services.AddRazorPages()
    .AddMvcOptions(options => { })
    .AddMicrosoftIdentityUI();

A third, default cookie scheme is added to keep the session after a successful authentication with either of the client schemes. The identity is signed into this scheme after a successful Azure AD authentication. The SignInAsync method is used for this in the OnTokenValidated event.

services.AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
    .AddCookie();

The Configure method is set up in a standard way.

public void Configure(IApplicationBuilder app)
{
    app.UseHttpsRedirection();
    app.UseStaticFiles();

    app.UseRouting();

    app.UseAuthentication();
    app.UseAuthorization();

    app.UseEndpoints(endpoints =>
    {
        endpoints.MapRazorPages();
        endpoints.MapControllers();
    });
}

The sign-in and sign-out need custom implementations. The SignInT1 method is used to authenticate using the first client and SignInT2 is used for the second. This can be called from the Razor page view. The CustomSignOut method is used to sign out of the correct schemes and redirect to the Azure AD end session endpoint. The CustomSignOut method uses the clientId of the Azure AD configuration to sign out the correct session. This value can be read using the aud claim.

using System.Threading.Tasks;
using Microsoft.AspNetCore.Authentication;
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.AspNetCore.Authorization;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Configuration;

namespace AspNetCoreRazorMultiClients
{
    [AllowAnonymous]
    [Route("[controller]")]
    public class CustomAccountController : Controller
    {
        private readonly IConfiguration _configuration;

        public CustomAccountController(IConfiguration configuration)
        {
            _configuration = configuration;
        }

        [HttpGet("SignInT1")]
        public IActionResult SignInT1([FromQuery] string redirectUri)
        {
            var scheme = "t1";
            string redirect;
            if (!string.IsNullOrEmpty(redirectUri) && Url.IsLocalUrl(redirectUri))
            {
                redirect = redirectUri;
            }
            else
            {
                redirect = Url.Content("~/")!;
            }

            return Challenge(new AuthenticationProperties { RedirectUri = redirect }, scheme);
        }

        [HttpGet("SignInT2")]
        public IActionResult SignInT2([FromQuery] string redirectUri)
        {
            var scheme = "t2";
            string redirect;
            if (!string.IsNullOrEmpty(redirectUri) && Url.IsLocalUrl(redirectUri))
            {
                redirect = redirectUri;
            }
            else
            {
                redirect = Url.Content("~/")!;
            }

            return Challenge(new AuthenticationProperties { RedirectUri = redirect }, scheme);
        }

        [HttpGet("CustomSignOut")]
        public async Task<IActionResult> CustomSignOut()
        {
            var aud = HttpContext.User.FindFirst("aud");

            if (aud.Value == _configuration["AzureAdT1:ClientId"])
            {
                await HttpContext.SignOutAsync(CookieAuthenticationDefaults.AuthenticationScheme);
                await HttpContext.SignOutAsync("cookiet1");

                var authSignOut = new AuthenticationProperties
                {
                    RedirectUri = "https://localhost:44348/SignoutCallbackOidc"
                };
                return SignOut(authSignOut, "t1");
            }
            else
            {
                await HttpContext.SignOutAsync(CookieAuthenticationDefaults.AuthenticationScheme);
                await HttpContext.SignOutAsync("cookiet2");

                var authSignOut = new AuthenticationProperties
                {
                    RedirectUri = "https://localhost:44348/SignoutCallbackOidc"
                };
                return SignOut(authSignOut, "t2");
            }
        }
    }
}

The _LoginPartial.cshtml Razor view can use the CustomAccount controller method to sign in or sign out. The available clients can be selected in a drop down control.

<ul class="navbar-nav">
@if (User.Identity.IsAuthenticated)
{
    <li class="nav-item">
        <span class="navbar-text text-dark">Hello @User.Identity.Name!</span>
    </li>
    <li class="nav-item">
        <a class="nav-link text-dark" asp-controller="CustomAccount" asp-action="CustomSignOut">Sign out</a>
    </li>
}
else
{
    <li>
        <div class="main-menu">
            <div class="dropdown">
                <button class="btn btn-primary dropdown-toggle" type="button" id="dropdownLangButton" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
                    sign-in
                </button>
                <div class="dropdown-menu" aria-labelledby="dropdownLangButton">
                    <a class="dropdown-item" asp-controller="CustomAccount" asp-action="SignInT1">t1</a>
                    <a class="dropdown-item" asp-controller="CustomAccount" asp-action="SignInT2">t2</a>
                </div>
            </div>
        </div>
    </li>
}
</ul>

The app.settings have the Azure AD settings for each client as required.

{ "AzureAdT1": { "Instance": "https://login.microsoftonline.com/", "Domain": "damienbodhotmail.onmicrosoft.com", "TenantId": "7ff95b15-dc21-4ba6-bc92-824856578fc1", "ClientId": "46d2f651-813a-4b5c-8a43-63abcb4f692c", "CallbackPath": "/signin-oidc/t1", "SignedOutCallbackPath ": "/SignoutCallbackOidc" // "ClientSecret": "add secret to the user secrets" }, "AzureAdT2": { "Instance": "https://login.microsoftonline.com/", "Domain": "damienbodhotmail.onmicrosoft.com", "TenantId": "7ff95b15-dc21-4ba6-bc92-824856578fc1", "ClientId": "8e2b45c2-cad0-43c3-8af2-b32b73de30e4", "CallbackPath": "/signin-oidc/t2", "SignedOutCallbackPath ": "/SignoutCallbackOidc" // "ClientSecret": "add secret to the user secrets" },

When the application is started, the user can login using any client as required.

This works really well if you don’t know which tenant should be your default scheme. If you always use a default scheme with one default tenant, then you can use the multiple-authentication-schemes example as defined in the Microsoft.Identity.Web docs.

Links:

https://github.com/AzureAD/microsoft-identity-web/wiki/multiple-authentication-schemes

https://github.com/AzureAD/microsoft-identity-web/wiki/customization#openidconnectoptions

https://github.com/AzureAD/microsoft-identity-web

https://docs.microsoft.com/en-us/aspnet/core/security/authentication

Saturday, 26. June 2021

Timothy Ruff

Verifiable Credentials Aren’t Credentials. And They’re Not Verifiable In the Way You Might Think.

Last summer I published a 3-part series about the similarities between verifiable credentials (VCs) and physical shipping containers: Part 1: Verifiable Credentials Aren’t Credentials. They’re Containers. Part 2: Like Shipping Containers, Verifiable Credentials Will Economically Transform the World Part 3: How Verifiable Credentials Bridge Trust Domains The similarities have only bec

Last summer I published a 3-part series about the similarities between verifiable credentials (VCs) and physical shipping containers:

Part 1: Verifiable Credentials Aren’t Credentials. They’re Containers.

Part 2: Like Shipping Containers, Verifiable Credentials Will Economically Transform the World

Part 3: How Verifiable Credentials Bridge Trust Domains

The similarities have only become more apparent, and important, now that VCs are rapidly growing in global adoption. So it’s time for an update, including some history behind the VC name and why “verifiable” is also problematic.

It can be found at my new blog location: https://credentialmaster.com/verifiable-credentials-arent-credentials-theyre-containers/


The Verifiable Credentials (VC) Stack

SSI and verifiable credentials (VCs) have spawned hundreds of startups and significant initiatives within Microsoft, IBM, Workday, Samsung, Deutsche Telekom, the entire EU, the U.S. Government, the global credit union industry, 40+ airlines, hundreds of healthcare organizations, GLEIF (international financial regulator), GS1 (global standards body for QR and bar codes), and more. How do the

SSI and verifiable credentials (VCs) have spawned hundreds of startups and significant initiatives within Microsoft, IBM, Workday, Samsung, Deutsche Telekom, the entire EU, the U.S. Government, the global credit union industry, 40+ airlines, hundreds of healthcare organizations, GLEIF (international financial regulator), GS1 (global standards body for QR and bar codes), and more.

How do the various types of products and services fit together?

Which are complementary, and which are competitive?

Should the services from provider A work with those from provider B?

Will you still need provider B if you have provider A?

Without naming specific providers, “The VC Stack” is Credential Master’s attempt to develop a framework for answering these questions at the highest, simplest, non-technical level possible. It is well over a year in the making, with input from many of the world’s top experts in SSI.

The VC Stack considers VCs as the foundational building block for SSI, rather than identity, blockchain, or anything else. It organizes VCs into four essential functions: Applications, Management, Processing, and Storage.

This separation clarifies product functions and vendor roles, expands administrative capabilities, reduces vendor lock-in, lessens the impact of changes in VC standards and technology, and enables service providers to focus on what they do best.

Learn more from my recent detailed blog: https://credentialmaster.com/the-vc-lifecycle/. We invite any and all feedback to make it better.

Friday, 25. June 2021

Werdmüller on Medium

The open banking elephant

The American banking system sucks. It’s time for open banking. Continue reading on Medium »
