Friday, August 23, 2024

Local Development With an AT Proto PDS

Using docker and curl to do the first steps of what could be creating an atproto app.

The Bluesky Personal Data Server (PDS) comes with instructions and automation for taking over a server and running a node connected to bsky.app, but I wanted to run a local disconnected PDS so I could just muck around with creating fake users and posting data as if I were creating a new atproto app.

Inside their system is a docker container, and we can run that by itself:

docker image pull ghcr.io/bluesky-social/pds:latest

You'll probably want the data somewhere other than /tmp:

mkdir -p /tmp/pdsdata/blocks

PDS_ADMIN_PASSWORD=$(openssl rand --hex 16)
PDS_JWT_SECRET=$(openssl rand --hex 16)
PLC_ROTATION_KEY=$(openssl ecparam --name secp256k1 --genkey --noout --outform DER | tail --bytes=+8 | head --bytes=32 | xxd --plain --cols 32)
PDS_DATADIR=/tmp/pdsdata
PDS_HOSTNAME=foo.bar.com

cat<<EOF >/tmp/pdsdata/pds.env
PDS_HOSTNAME=${PDS_HOSTNAME}
PDS_JWT_SECRET=${PDS_JWT_SECRET}
PDS_ADMIN_PASSWORD=${PDS_ADMIN_PASSWORD}
PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX=${PLC_ROTATION_KEY}
PDS_DATA_DIRECTORY=/pds
PDS_BLOBSTORE_DISK_LOCATION=/pds/blocks
PDS_DID_PLC_URL=https://plc.directory
PDS_BSKY_APP_VIEW_URL=https://api.bsky.app
PDS_BSKY_APP_VIEW_DID=did:web:api.bsky.app
PDS_REPORT_SERVICE_URL=https://mod.bsky.app
PDS_REPORT_SERVICE_DID=did:plc:ar7c4by46qjdydhdevvrndac
PDS_CRAWLERS=https://bsky.network
LOG_ENABLED=true
NODE_ENV=development
EOF
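
If openssl or xxd is missing, the command substitutions above quietly produce empty values and the env file ends up with blank secrets. A minimal sketch of a sanity check before starting the container follows; `check_pds_env` is a hypothetical helper of my own, not part of the PDS distribution, and the list of required keys is just the ones generated above.

```shell
# Sketch: sanity-check a PDS env file before starting the container.
# check_pds_env is a hypothetical helper, not part of the PDS distribution.
check_pds_env() {
  env_file=$1
  for key in PDS_HOSTNAME PDS_JWT_SECRET PDS_ADMIN_PASSWORD \
             PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX; do
    # Each required key must be present with a non-empty value.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: ${key}" >&2
      return 1
    fi
  done
  # 32 bytes of key material should be exactly 64 lowercase hex digits.
  if ! grep -Eq '^PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX=[0-9a-f]{64}$' "$env_file"; then
    echo "rotation key is not 64 hex digits" >&2
    return 1
  fi
  echo "env file looks complete"
}

# Usage: check_pds_env /tmp/pdsdata/pds.env
```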

docker run -d --name pds --restart=unless-stopped --network=host -v "${PDS_DATADIR}:/pds" --env-file="${PDS_DATADIR}/pds.env" ghcr.io/bluesky-social/pds:latest

With that, it's probably running. If you ever need any of the secrets, they're in pdsdata/pds.env.

Run a couple of quick queries that should return server state:
curl -L 'http://localhost:3000/xrpc/com.atproto.sync.listRepos?limit=100'
curl -L 'http://localhost:3000/xrpc/com.atproto.server.describeServer'
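
The container can take a moment to start answering, so a script that runs these queries right after `docker run` may fail on the first try. Here is a rough sketch of a wait loop, assuming the PDS health endpoint at `/xrpc/_health` and the default port 3000 used above; the function name, retry count, and timeout are my own choices.

```shell
# Sketch: poll the PDS until it answers, or give up.
# wait_for_pds, its arguments, and the retry count are illustrative choices.
wait_for_pds() {
  base_url=${1:-http://localhost:3000}
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s silences progress output.
    if curl -fs --max-time 2 "${base_url}/xrpc/_health" >/dev/null 2>&1; then
      echo "PDS is up at ${base_url}"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "PDS did not respond after ${attempts} attempts" >&2
  return 1
}

# Usage: wait_for_pds http://localhost:3000 30 && echo ready
```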

Let's create a user. First, create an invite code:
curl -L -X POST 'http://localhost:3000/xrpc/com.atproto.server.createInviteCode' --user "admin:${PDS_ADMIN_PASSWORD}" --data-raw '{"useCount":1}' -H 'Content-Type: application/json'

This will return something like
{"code":"foo-bar-com-zx6uq-cjkle"}

Then actually create the user:
curl -L -X POST 'http://localhost:3000/xrpc/com.atproto.server.createAccount' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
--user "admin:${PDS_ADMIN_PASSWORD}" \
--data-raw '{"handle": "bob2.foo.bar.com","inviteCode":"foo-bar-com-zx6uq-cjkle","email":"bob2@foo.bar.com","password":"hunter2"}'

This returns a bunch of JSON about the created user. The parts we need for the next steps are "did" and "accessJwt", like:

"did":"did:plc:ks35mpkznuqalfrr3drm7tfu"
"accessJwt":"eyJhbGciOiJIUzI1NiJ9.eyJzY29wZSI6ImNvbS5hdHByb3RvLmFjY2VzcyIsImF1ZCI6ImRpZDp3ZWI6ZG8uYm9sc29uLm9yZyIsInN1YiI6ImRpZDpwbGM6a3MzNW1wa3pudXFhbGZycjNkcm03dGZ1IiwiaWF0IjoxNzI0NDI3NTk0LCJleHAiOjE3MjQ0MzQ3OTR9._uG-PTAqNLQT7PqohWFv-0lvyQtz_ud5XOwCqnjDPqs"

Let's put that in a shell variable to make the next commands nicer:

ACCESS_JWT="eyJhbGciOiJIUzI1NiJ9.eyJzY29wZSI6ImNvbS5hdHByb3RvLmFjY2VzcyIsImF1ZCI6ImRpZDp3ZWI6ZG8uYm9sc29uLm9yZyIsInN1YiI6ImRpZDpwbGM6a3MzNW1wa3pudXFhbGZycjNkcm03dGZ1IiwiaWF0IjoxNzI0NDI3NTk0LCJleHAiOjE3MjQ0MzQ3OTR9._uG-PTAqNLQT7PqohWFv-0lvyQtz_ud5XOwCqnjDPqs"
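
Copy-pasting the token works, but the two fields can also be pulled out of the response in the shell. This is a crude sketch using sed; `json_field` is a hypothetical helper, it only handles flat string fields with no escaped quotes, and the response below is a made-up placeholder rather than real output. If jq is installed it would be the more robust tool.

```shell
# Crude extractor for a string field in flat JSON read from stdin.
# Assumes the value contains no escaped quotes; json_field is a hypothetical helper.
json_field() {
  # $1 = field name
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Made-up response standing in for the createAccount output (not real credentials).
RESPONSE='{"handle":"bob2.foo.bar.com","did":"did:plc:example123","accessJwt":"aaa.bbb.ccc","refreshJwt":"ddd.eee.fff"}'

USER_DID=$(printf '%s' "$RESPONSE" | json_field did)
ACCESS_JWT=$(printf '%s' "$RESPONSE" | json_field accessJwt)
echo "did=${USER_DID}"
echo "jwt=${ACCESS_JWT}"
```

In real use, `RESPONSE` would be the captured output of the createAccount curl command.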

The first couple of 'admin' APIs used HTTP Basic auth (user:password), with the user as 'admin' and the password being the server-wide one set in pds.env during setup. Later actions as a user put the access JWT blob into the Authorization header.
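
The access JWT is not opaque: its middle segment is base64url-encoded JSON, typically including an `exp` expiry timestamp, so it is easy to peek at when a request starts failing. A small sketch for decoding it follows, assuming a `base64` that accepts `-d` (GNU coreutils does). `jwt_payload` is my own helper name, and this only decodes the payload, it does not verify the signature.

```shell
# Sketch: print the JSON payload (middle segment) of a JWT.
# Decodes only; it does NOT check the signature. jwt_payload is a hypothetical helper.
jwt_payload() {
  # Take the second dot-separated segment and map base64url to plain base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the '=' padding; restore it to a multiple of 4 characters.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 -d
}

# Usage: jwt_payload "$ACCESS_JWT"
```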

Post a record! (note validate=false because I'm not posting proper lexicon data)

curl -L -X POST 'http://localhost:3000/xrpc/com.atproto.repo.createRecord' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H "Authorization: Bearer ${ACCESS_JWT}" \
--data-raw '{"repo":"did:plc:ks35mpkznuqalfrr3drm7tfu","collection":"com.bar.wat","rkey":"aoeustnh1234098","validate":false,"record":{"a":"foo","b":1234}}'

That should return JSON with info about the newly created record. Then I did that a few times with a different "rkey" and tweaked record contents each time.

List the records back out:

curl -L 'http://localhost:3000/xrpc/com.atproto.repo.listRecords?repo=did:plc:ks35mpkznuqalfrr3drm7tfu&collection=com.bar.wat'

Should return JSON:

{"records":[{...},{...},{...}], "cursor":"some rkey"}
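
The "cursor" in that response is how you page: pass it back as the `cursor` query parameter to get the next batch, optionally with `limit` to set the page size. Here is a small sketch of a URL builder for that, assuming cursor values are URL-safe as rkeys normally are; `PDS_URL`, `LIMIT`, and `list_records_url` are names I made up for illustration.

```shell
# Sketch: build listRecords URLs for paging. PDS_URL, LIMIT, and
# list_records_url are illustrative names, not part of any SDK.
PDS_URL=${PDS_URL:-http://localhost:3000}

list_records_url() {
  # $1 = repo DID, $2 = collection NSID, $3 = optional cursor from the previous page
  url="${PDS_URL}/xrpc/com.atproto.repo.listRecords?repo=$1&collection=$2&limit=${LIMIT:-50}"
  if [ -n "${3:-}" ]; then
    url="${url}&cursor=$3"
  fi
  printf '%s\n' "$url"
}

# First page, then the next page using the "cursor" value from the response:
#   curl -L "$(list_records_url did:plc:ks35mpkznuqalfrr3drm7tfu com.bar.wat)"
#   curl -L "$(list_records_url did:plc:ks35mpkznuqalfrr3drm7tfu com.bar.wat "$CURSOR")"
```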

Great! Now I feel like I'm ready to put and retrieve data in an atproto PDS and start developing an app. There are atproto SDKs, but I wanted to make sure I really got it by doing it all in curl first.

Sunday, August 4, 2024

Goodbye Blockchain

In 2018 I started working in blockchain/cryptocurrency stuff at Algorand. They had a clever new system that fixed the gigawatt-waste problem of Bitcoin and was 100x faster, and I figured "I guess blockchain is something people want, and this is good tech, I'll try it".

I spent 4 years there and was always proud of the engineering quality and the culture had a healthy skepticism and distance from the hype around blockchain things.

What I really wanted from the dream of cryptocurrency is to unseat the Visa/MC duopoly and their $0.30 + 3% drain on every transaction. I wanted transactions of any size, censorship resistant, for $0.01 or less. But the current climate and laws prevent that.

I'm pretty tired of crypto-bro culture. I don't care about NFTs. Nobody should care about NFTs. "shitcoins" are explicitly value-less and people in the biz call them shitcoins and yet still worry about optimizing their tech to support more shitcoins. This sucks.

Crypto-bros apparently don't care that Twitter is a cesspool and feeding that machine makes the world a worse place. And they don't care that Bitcoin wastes gigawatts of power and must be doomed to fail.

One of the biggest innovations of Bitcoin may be that it created a distributed leaderless Ponzi scheme. Hype it enough and create the next generation of suckers to buy in and buy out your share at a profit.

The crypto markets seem to be entirely hype driven. Worse tech with more hype can beat better tech. It's a big downer when I want to take pride in working on better tech. I no longer care to take a bet on which blockchains will survive to 5 or 10 years from now. I quit.

New job: try to make the world a better place through good social media (and make a buck in the process) at Bluesky Social. I'll mostly be on backend just trying to keep the place running reliably and efficiently, but I'll be doing my part on new features too.

Follow me there at:
https://bsky.app/profile/bolson.bsky.social

Saturday, June 29, 2024

How Much Complexity Can Your Tribe Manage?

I've heard of a Sci-Fi thought experiment: how many people do you need to ship on the 100 year journey to another solar system? We'd need some number of people to specialize in all of the things required for modern civilization, and some number of people to do all the farmer/plumber/electrician jobs to keep the world running. Some Sci-Fi brain thought the number was 10_000 people.

Ok, now think about your small company of 100 people. How much complexity can you manage? How many systems can you maintain?

I think I've worked at a few small companies where the answer should have been 'fewer'. A few less crucial things got neglected, and it was messy. Maybe there could have been a better way to consciously wind down the complexity and do fewer things better.

Thursday, June 20, 2024

Bitcoin Must Die

Bitcoin is horribly wasteful. The University of Cambridge estimate is that Bitcoin currently burns 150 terawatt-hours per year. (2023 US generation was 4178 TWh.)

Multiply that by $0.0773/kWh (March 2024 EIA cost to industrial sector) and we get the operational cost of Bitcoin (power alone! doesn't include cost of hardware!) at about $11.6 billion a year.

As of this morning (2024-06-20) Coinbase estimates that all Bitcoin is worth about $1.3 trillion. The electricity operating cost should be depreciating that value by 0.9% per year (hardware costs should take another chunk out).

According to blockchain.com, there were 153_415_993 bitcoin txns in 2023. Divide out the operating electricity cost and that's $75.58 per transaction. Recent txn fees are more like $5-$6. The Bitcoin system loses $70 on every transaction. The $1.3 trillion valuation is based on hype and hope and a bubble and the inflow of new rubes on the broad base of the lowest rung of the pyramid scheme.
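
The arithmetic above is easy to re-run when the inputs change. A quick sketch with awk, using only the figures quoted in this post; the variable names are my own.

```shell
# Inputs copied from the text above; variable names are illustrative.
REPORT=$(awk 'BEGIN {
  twh_per_year = 150          # Cambridge estimate, TWh per year
  usd_per_kwh  = 0.0773       # EIA industrial rate, March 2024
  market_cap   = 1.3e12       # USD, as of 2024-06-20
  txns_2023    = 153415993    # blockchain.com count for 2023

  cost = twh_per_year * 1e9 * usd_per_kwh   # 1 TWh = 1e9 kWh
  printf "electricity cost per year: %.3f billion USD\n", cost / 1e9
  printf "share of market cap: %.2f%%\n", 100 * cost / market_cap
  printf "electricity cost per txn: %.2f USD\n", cost / txns_2023
}')
echo "$REPORT"
```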

What a sustainable blockchain would be

Transaction fees for moving value from here to there (I think we can do better than credit card ($0.30 + 3%) or FedNow $0.05).
Transaction fees for recording a fact (comparable to notarizing, or trademark/copyright filing fees).
If all those fees are enough to pay for operations, and an acceptable margin of profit, congratulations, you have a financial institution.

Those 153_415_993 transactions on Bitcoin in 2023 come to an average of about 5 transactions per second. In 2019 I demonstrated the Algorand blockchain achieving 75 TPS running on three Raspberry Pi 3B+ toy computers. Algorand*, Aptos* and other newer blockchains continue to offer transactions for under a penny, and could be sustainable at this rate.
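
A quick check of the average rate, dividing the 2023 transaction count by the number of seconds in a 365-day year (the variable name is my own):

```shell
# Average transactions per second over a 365-day year.
TPS=$(awk 'BEGIN { printf "%.1f", 153415993 / (365 * 24 * 60 * 60) }')
echo "average Bitcoin TPS in 2023: ${TPS}"
```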

BUT, the regulatory environment around blockchains continues to be murky at best, crushing at worst. It remains to be seen how this will settle out in jurisdictions around the world and if those markets will continue to have enough business to support the several blockchains competing to grow their network effect and win in the market.

(* I have a stake in Algorand and Aptos, other blockchains are probably also technically competent ;-) )

Monday, January 22, 2024

Cloud compute with SSD is a great deal

Want fast data processing? Use attached SSD on your cloud compute nodes. It's a great deal.

Consider the AWS m6gd.8xlarge instance type. It has an attached 1900 GB NVMe SSD. One source benchmarks it at 444_552 Read IOPS, 186_467 Write IOPS. It costs $1055.87/Month.

Right next door is the m6g.8xlarge instance without SSD for $899.36/Month.

Using the AWS price calculator, replacing that storage with EBS and provisioned IOPS we have:

gp3 16_000 IOPS 1900 GB: $217.00/Month

io2 64_000 IOPS 1900 GB: $3,773.50/Month

The attached SSD adds only $156.51/Month over the instance without it, so neither of those plans is cheaper than the faster attached SSD.
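
The comparison comes down to the price gap between the two instances versus each EBS plan. A small awk sketch using only the monthly prices quoted above; the variable names are my own.

```shell
# Monthly prices copied from the text above; variable names are illustrative.
COMPARE=$(awk 'BEGIN {
  with_ssd    = 1055.87   # m6gd.8xlarge, attached NVMe SSD
  without_ssd = 899.36    # same size instance without SSD
  gp3         = 217.00    # EBS gp3, 16_000 IOPS, 1900 GB
  io2         = 3773.50   # EBS io2, 64_000 IOPS, 1900 GB

  premium = with_ssd - without_ssd
  printf "attached SSD premium: %.2f USD/month\n", premium
  printf "gp3 costs %.2f more than the premium\n", gp3 - premium
  printf "io2 costs %.2f more than the premium\n", io2 - premium
}')
echo "$COMPARE"
```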

This is in terms of AWS, but I'm pretty sure it applies to GCP and other providers too.

The benefit of EBS is that it's more durable: it survives instance shutdowns and crashes. But I've worked at several jobs where a large-ish chunk of data needs fast processing, but will be extensively backed up or otherwise redundantly stored elsewhere. Losing a server would be a temporary inconvenience, easily recovered.