Tag: High Availability

PostgreSQL Connection Pooling: Part 1 – Pros & Cons

A long time ago, in a galaxy far far away, ‘threads’ were a programming novelty rarely used and seldom trusted. In that environment, the first PostgreSQL developers decided that forking a process for each connection to the database was the safest choice. It would be a shame if your database crashed, after all.

Since then, a lot of water has flowed under that bridge, but the PostgreSQL community has stuck by their original decision. It is difficult to fault their argument – as it’s absolutely true that:

  • Each client having its own process prevents a poorly behaving client from crashing the entire database.
  • On modern Linux systems, the difference in overhead between forking a process and creating a thread is much smaller than it used to be.
  • Moving to a multithreaded architecture would require an extensive rewrite.

However, in modern web applications, clients tend to open a lot of connections. Developers are often strongly discouraged from holding a database connection while other operations take place. “Open a connection as late as possible, close a connection as soon as possible”. But that causes a problem with PostgreSQL’s architecture – forking a process becomes expensive when transactions are very short, as the common wisdom dictates they should be. In this post, we cover the pros and cons of PostgreSQL connection pooling.

The PostgreSQL Architecture | Source

The Connection Pool Architecture

Using a modern language library does reduce the problem somewhat – connection pooling is an essential feature of most popular database-access libraries. It ensures ‘closed’ connections are not really closed, but returned to a pool, and ‘opening’ a new connection returns the same ‘physical connection’ back, reducing the actual forking on the PostgreSQL side.
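To make the client-side pattern concrete, here is a minimal sketch using psycopg2’s built-in pool. The DSN, pool sizes, and the users table are illustrative assumptions rather than details from any particular application; other libraries (HikariCP, SQLAlchemy’s pool, etc.) follow the same shape.

```python
# Minimal client-side pooling sketch with psycopg2 (DSN, pool sizes and the
# 'users' table are illustrative assumptions).
from psycopg2 import pool

# The expensive fork on the PostgreSQL side happens when the pool creates its
# physical connections, not on every logical "open"/"close" in application code.
pg_pool = pool.SimpleConnectionPool(
    minconn=1,
    maxconn=10,
    dsn="dbname=app user=app_user host=127.0.0.1 port=5432",
)

def count_users() -> int:
    conn = pg_pool.getconn()       # hands back an already-open physical connection
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT count(*) FROM users;")
            return cur.fetchone()[0]
    finally:
        pg_pool.putconn(conn)      # "closing" just returns it to the pool

if __name__ == "__main__":
    print(count_users())
```

This keeps the “open late, close early” discipline in the application code while avoiding a fresh fork per query.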

The architecture of a generic connection pool

However, modern web applications are rarely monolithic, and often use multiple languages and technologies. Using a connection pool in each module is hardly efficient:

  • Even with a relatively small number of modules, and a small pool size in each, you end up with a lot of server processes. Context-switching between them is costly.
  • The pooling support varies widely between libraries and languages – one badly behaving pool can consume all resources and leave the database inaccessible by other modules.
  • There is no centralized control – you cannot use measures like client-specific access limits.

As a result, popular middlewares have been developed for PostgreSQL. These sit between the database and the clients, sometimes on a separate server (physical or virtual) and sometimes on the same box, and create a pool that clients can connect to. These middlewares:

  • Are optimized for PostgreSQL and its rather unique architecture amongst modern DBMSes.
  • Provide centralized access control for diverse clients.
  • Allow you to reap the same rewards as client-side pools, and then some (we will discuss these in more detail in our next posts)!
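From the application’s point of view, adopting such a middleware is usually just a change of connection target. A hedged sketch, assuming a PgBouncer instance listening on its default port 6432 in front of the same database (host name, database, and credentials are made up):

```python
# Hypothetical comparison of connecting directly vs. through a middleware
# pooler such as PgBouncer. 6432 is PgBouncer's default listen port; the
# host, database and user below are made-up examples.
import psycopg2

# Direct connection: PostgreSQL forks a dedicated backend process for this client.
direct = psycopg2.connect("dbname=app user=app_user host=db.internal port=5432")

# Via the pooler: the client talks to PgBouncer, which hands the query to an
# already-open server connection from its pool.
pooled = psycopg2.connect("dbname=app user=app_user host=db.internal port=6432")

for label, conn in (("direct", direct), ("pooled", pooled)):
    with conn.cursor() as cur:
        cur.execute("SELECT version();")
        print(label, cur.fetchone()[0])
    conn.close()
```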

PostgreSQL Connection Pooler Cons

A connection pooler is an almost indispensable part of a production-ready PostgreSQL setup. While there are plenty of well-documented benefits to using a connection pooler, there are some arguments to be made against using one:

  • Introducing a middleware in the communication path inevitably adds some latency. However, when the pooler is located on the same host, and factoring in the overhead of forking a connection, this is negligible in practice, as we will see in the next section.
  • A middleware becomes a single point of failure. Using a cluster at this level can resolve this issue, but that introduces added complexity to the architecture.

    Redundancy in middleware to avoid a single point of failure | Source

  • A middleware implies extra costs. You either need an extra server (or 3), or your database server(s) must have enough resources to support a connection pooler, in addition to PostgreSQL.
  • Sharing connections between different modules can become a security vulnerability. It is very important that we configure Pgpool-II or PgBouncer to clean connections before they are returned to the pool (see the sketch after this list).
  • The authentication shifts from the DBMS to the connection pooler. This may not always be acceptable.

    PgBouncer Authentication Model | Source

  • It increases the surface area for attack, unless access to the underlying database is locked down to allow access only via the connection pooler.
  • It creates yet another component that must be maintained, fine-tuned for your workload, patched regularly for security, and upgraded as required.
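To illustrate the connection-cleaning point above: session state set by one client survives on the physical connection and is visible to the next client that receives it from the pool, unless something resets it in between. A minimal sketch, assuming a local database and psycopg2’s pool; in PgBouncer the equivalent safeguard is its server_reset_query setting, typically DISCARD ALL.

```python
# Hedged illustration of session state leaking through a shared connection.
# The DSN is a made-up example; any pooler (client-side or middleware) has
# the same issue if connections are not cleaned between clients.
from psycopg2 import pool

pg_pool = pool.SimpleConnectionPool(
    minconn=1, maxconn=1,
    dsn="dbname=app user=app_user host=127.0.0.1 port=5432",
)

# "Client A" sets session-local state and returns its connection to the pool.
conn = pg_pool.getconn()
conn.autocommit = True                      # keep the demo free of explicit transactions
with conn.cursor() as cur:
    cur.execute("SET application_name = 'client_a';")
pg_pool.putconn(conn)

# "Client B" receives the same physical connection and inherits that state.
conn = pg_pool.getconn()
with conn.cursor() as cur:
    cur.execute("SHOW application_name;")
    print(cur.fetchone()[0])                # prints 'client_a' -- leaked session state
    cur.execute("DISCARD ALL;")             # what a safe pooler runs between clients
pg_pool.putconn(conn)
```

Running DISCARD ALL (or configuring the pooler to do so) wipes session settings, temporary tables, and prepared statements before the connection is handed to the next client.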

Should You Use a PostgreSQL Connection Pooler?

All of these problems are well discussed in the PostgreSQL community, however, and mitigation strategies ensure that the pros of a connection pooler far exceed the cons. Our tests show that even a small number of clients can benefit significantly from using a connection pooler. They are well worth the added configuration and maintenance effort.

In the next post, we will discuss one of the most popular connection poolers in the PostgreSQL world – PgBouncer, followed by Pgpool-II, and lastly a performance test comparison of these two PostgreSQL connection poolers in our final post of the series.

from High Scalability

Sponsored Post: Fauna, Sisu, Educative, PA File Sight, Etleap, PerfOps, Triplebyte, Stream

Who’s Hiring? 

  • Sisu Data is looking for machine learning engineers who are eager to deliver their features end-to-end, from Jupyter notebook to production, and provide actionable insights to businesses based on their first-party, streaming, and structured relational data. Apply here.
  • Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Make your job search O(1), not O(n). Apply here.
  • Need excellent people? Advertise your job here! 

Cool Products and Services

  • Stateful JavaScript Apps. Effortlessly add state to your JavaScript apps with FaunaDB. Generous free tier. Try now!
  • Grokking the System Design Interview is a popular course on Educative.io (taken by 20,000+ people) that’s widely considered the best System Design interview resource on the Internet. It goes deep into real-world examples, offering detailed explanations and useful pointers on how to improve your approach. There’s also a no questions asked 30-day return policy. Try a free preview today.
  • PA File Sight – Actively protect servers from ransomware, audit file access to see who is deleting files, reading files or moving files, and detect file copy activity from the server. Historical audit reports and real-time alerts are built-in. Try the 30-day free trial!
  • For heads of IT/Engineering responsible for building an analytics infrastructure, Etleap is an ETL solution for creating perfect data pipelines from day one. Unlike older enterprise solutions, Etleap doesn’t require extensive engineering work to set up, maintain, and scale. It automates most ETL setup and maintenance work, and simplifies the rest into 10-minute tasks that analysts can own. Read stories from customers like Okta and PagerDuty, or try Etleap yourself.
  • PerfOps is a data platform that digests real-time performance data for CDN and DNS providers as measured by real users worldwide. Leverage this data across your monitoring efforts and integrate with PerfOps’ other tools such as Alerts, Health Monitors and FlexBalancer – a smart approach to load balancing. FlexBalancer makes it easy to manage traffic between multiple CDN providers, API’s, Databases or any custom endpoint helping you achieve better performance, ensure the availability of services and reduce vendor costs. Creating an account is Free and provides access to the full PerfOps platform.
  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it’s easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure; this includes apps with 30 million users. With your help we’d like to add a few zeros to that number. Check out the job opening on AngelList.
  • Advertise your product or service here! 

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.


PA File Sight monitors file access on a server in real-time.

It can track who is accessing what, and with that information can help detect file copying, detect (and stop) ransomware attacks in real-time, and record the file activity for auditing purposes. The collected audit records include user account, target file, the user’s IP address and more. This solution does NOT require Windows Native Auditing, which means there is no performance impact on the server. Join thousands of other satisfied customers by trying PA File Sight for yourself. No sign up is needed for the 30-day fully functional trial.


Make Your Job Search O(1) — not O(n)

Triplebyte is unique because they’re a team of engineers running their own centralized technical assessment. Companies like Apple, Dropbox, Mixpanel, and Instacart now let Triplebyte-recommended engineers skip their own screening steps.

We found that High Scalability readers are about 80% more likely to be in the top bracket of engineering skill.

Take Triplebyte’s multiple-choice quiz (system design and coding questions) to see if they can help you scale your career faster.


If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

from High Scalability

Stuff The Internet Says On Scalability For October 11th, 2019

 Wake up! It’s HighScalability time:

Light is fast—or is it?

Do you like this sort of Stuff? I’d greatly appreciate your support on Patreon. And I wrote Explain the Cloud Like I’m 10 for all who want to understand the cloud. On Amazon it has 57 mostly 5 star reviews (135 on Goodreads). Please consider recommending it. You’ll be a cloud hero.

Number Stuff:

  • 1,717,077,725: number of web servers in 2019. In 1994? 623.
  • 7,000,000,000,000: LinkedIn messages sent per day with Apache Kafka (sort of).
  • more data ever: collected by the LSST (Large Synoptic Survey Telescope) in its first year than all telescopes have ever collected—combined. It will do so for 10 years. That’s 15TB of data collected every night.
  • 3200 megapixel: LSST camera sensor, 250x better than an iPhone, the equivalent of half a basketball court filled with 4K TVs to gather one raw image.
  • 4 million: new jobs created in Africa because of investment in cell phone networks.
  • 442%: ROI running Windows workloads on AWS; 56% lower five-year cost of operations; 98% less unplanned downtime; 31% higher internal customer satisfaction; 37% lower IT infrastructure costs; 75% more efficient IT infrastructure team; 26% higher developer productivity; 32% higher gross productivity.
  • several petabytes: logs generated per hour by millions of machines at Facebook. Scribe processes logs with an input rate that can exceed 2.5 terabytes per second and an output rate that can exceed 7 terabytes per second. 
  • lowest: spending on tech acquisitions is at its lowest quarterly level in nearly two years due to rising global uncertainty coupled with slowing economic growth.
  • 5%-8%: per year battery energy density improvement. We expect the storage per unit mass and volume of batteries will probably plateau within 10 to 20 years. At the same time, the market penetration of lithium batteries is doubling every 4 to 5 years. 
  • 27: tech companies raised $100 million or more, taking in a total of $7.1 billion during the month of September.
  • 50%: price reduction for Intel’s Cascade Lake X-Series. 
  • 16x: how much faster Redis is at reading JSON blobs compared to PostgreSQL.
  • $225B: worth of Google cloud. @lacker: I was working at Google when AWS launched in 2006. The internal message was, we don’t want to build a competitor. We have the technology to compete in this area, but it is fundamentally a low-margin business, whereas reinvesting in our core business is high-margin, so we should keep our infrastructure to ourselves.
  • $21.9B: app (App Store 65%, Google Play 35%) revenue in Q3 2019, a 23% increase. WhatsApp is #1. TikTok is #2. Messenger is #3. Facebook is #4. Instagram is #5. Mobile gaming is 74% of total revenue. 
  • 77,000: virtual neurons simulated in real-time on a 1 million processor supercomputer. 

 Quotable Stuff:

  • @sh: What are some famous last words in your industry? @Liv_Lanes: “How does it scale?”
  • Erin Griffith: It is now difficult for “a growth-at-all-costs company burning hundreds of millions of dollars with negative unit economics” to get funding, he said. “This is going to be a healthy reset for the tech industry.”
  • @garykmrivers: Milk delivery 25 years ago was essentially a subscription service offering products with recyclable/reusable packaging, delivered by electric vehicles. Part of me thinks that if a techie firm were to have proposed this same idea today people would think it was incredible.
  • @investing_cit: Costco is a fascinating business. You know all those groceries you buy? Yeah, they basically sell those at break even and then make all of their profit from the $60 annual membership fees. This is the key. The company keeps gross margins as low as possible.  In turn, this gives it pricing authority. In other words, you don’t even look at the price because you know it’s going to be the best. Off of merchandise, Costco’s gross margins are only 11%. Compare this to Target. Gross margins are almost 30%. Or against Walmart. About 25%. The company sells its inventory typically before it needs to pay suppliers. In other words, suppliers do what Costco tells them to do. Costco has essentially aggregated demand which it can then leverage against its suppliers in the form of payment terms. See, the DSI and DPO are basically the same.  On top of this, Costco collects cash in about 4 days, so that’s the extent of the cash conversion cycle.
  • peterwwillis: In fact, I’m going to make a very heretical suggestion and say, don’t even start writing app code until you know exactly how your whole SDLC, deployment workflow, architecture, etc will work in production. Figure out all that crap right at the start. You’ll have a lot of extra considerations you didn’t think of before, like container and app security scanning, artifact repository, source of truth for deployment versions, quality gates, different pipelines for dev and prod, orchestration system, deployment strategy, release process, secrets management, backup, access control, network requirements, service accounts, monitoring, etc. 
  • Jessica Quillin: We are likely facing a new vision for work, one in which humans work at higher levels of productivity (think less work, but more output), thanks to co-existing with robots, working side-by-side personal robots, digital assistants, or artificial intelligence tools. Rather than being bogged down by easily automated processes, humans can leverage robots to focus on more abstract, creative tasks, bringing about new innovative solutions.
  • Edsger W. Dijkstra: Abstraction is not about vagueness, it is about being precise at a new semantic level.
  • @TooMuchMe: Tomorrow, the City of Miami will vote on whether to grant a 30-year contract on light poles that will have cameras, license plate readers and flood sensors. For free. The catch: Nothing would stop the contracting company from keeping all your data and selling it to others.
  • K3wp: I used to work in the same building as [Ken Thompson]. He’s a nice guy, just not one for small talk. Gave me a flying lesson (which terrified me!) once. My father compares him to Jamie Hyneman, which is apt. Just a gruff, no-nonsense engineer with no time or patience for shenanigans
  • Richard Lawson: McDonnell recalls, wistfully, the bygone days, when a creator could directly email the guy who ran YouTube’s homepage. These days, nearly every creator I spoke to seemed haunted and awed by the platform’s fabled algorithm. They spoke of it as one would a vague god or as a scapegoat, explaining away the fading of clout or relevance.
  • @Inc: A 60-year-old founder is 3 times as likely to found a successful startup as a 30-year-old founder.
  • Andy Greenberg: Elkins programmed his tiny stowaway chip to carry out an attack as soon as the firewall boots up in a target’s data center. It impersonates a security administrator accessing the configurations of the firewall by connecting their computer directly to that port. Then the chip triggers the firewall’s password recovery feature, creating a new admin account and gaining access to the firewall’s settings. 
  • DSHR: If running a Lightning Network node were to be even a break-even business, the transaction fees would have to more than cover the interest on the funds providing the channel liquidity. But this would make the network un-affordable compared with conventional bank-based electronic systems, which can operate on a net basis because banks trust each other.
  • Marc Benioff: What public markets do is indeed the great reckoning. But it cleanses [a] company of all of the bad stuff that they have. I think in a lot of private companies these days, we’re seeing governance issues all over the place. I can’t believe this is the way they were running internally in all of these cases. They are staying private way too long.
  • Benjamin Franklin: I began now gradually to pay off the debt I was under for the printing house. In order to secure my credit and character as a tradesman, I took care not only to be in reality industrious and frugal, but to avoid all appearances to the contrary. I dressed plainly; I was seen at no places of idle diversion; I never went out a-fishing or shooting; a book, indeed, sometimes debauched me from my work, but that was seldom, snug, and gave no scandal; and, to show that I was not above my business, I sometimes brought home the paper I purchased at the stores through the streets on a wheelbarrow. Thus, being esteemed an industrious, thriving young man, and paying duly for what I bought, the merchants who imported stationery solicited my custom; others proposed supplying me with books, and I went on swimmingly.
  • @brightball: “BMW’s greatest product isn’t a car, it’s the factory.” – Best quote from #SAFeSummit #SAFe @ScaledAgile
  • @robmay: As an example, some research shows that more automation in warehouses increases overall humans working in the industry.  Why?  Because when you lower the human labor costs of a warehouse, you can put more warehouses in smaller towns that weren’t economically feasible before.  Having more automation will, initially, increase the desire for human skills like judgment, empathy, and just good old human to human interaction in some fields. The important point here is that you can’t think linearly about what will happen. It’s not a 1:1 replacement of automation taking human jobs. It is complex, and will change work in many different ways.
  • Quinn: The most consistent mistake that everyone makes when using AWS—this extends to life as well—is once people learn something, they stop keeping current on that thing. There is an entire ecosystem of people who know something about AWS, with a certainty. That is simply no longer true, because capabilities change. Restrictions get relaxed. Constraints stop applying. If you learned a few years ago that there are only 10 tags permitted per resource, you aren’t necessarily keeping current to understand that that limit is now 50.
  • @BrianRoemmele: Consider: The 1843 facsimile machine invented by Alexander Bain a clock inventor. A clock synchronize movement of two pendulums for line-by-line scanning of a message. It wasn’t until the 1980s that network effect, cost of machines made it very popular. Mechanical to digital. 
  • @joshuastager: “If the ISPs had not repeatedly sued to repeal every previous FCC approach, we wouldn’t be here today.” – @sarmorris
  • @maria_fibonacci: – Make each program do one thing well. – Expect the output of every program to become the input to another, as yet unknown, program. I think the UNIX philosophy is very Buddhist 🙂
  • @gigastacey: People love their Thermomixers so much that of the 3 million connected devices they have sold, those who use their app have a 50% conversion to a subscription. That is an insane conversion rate. #sks2019
  • eclipsetheworld: I think this quote sums up my thoughts quite nicely: “When I was a product manager at Facebook and Instagram, building a true content-first social network was the holy grail. We never figured it out. Yet somehow TikTok has cracked the nut and leapfrogged everyone else.” — Eric Bahn, General Partner at Hustle Fund & Ex Instagram Product Manager
  • Doug Messier: The year 2018 was the busiest one for launches in decades. There were a total of 111 completely successful launches out of 114 attempts. It was the highest total since 1990, when 124 launches were conducted. China set a new record for launches in 2018. The nation launched 39 times with 38 successes in a year that saw a private Chinese company fail in the country’s first ever orbital launch attempt. The United States was in second place behind China with 34 launches. Traditional leader Russia launched 20 times with one failure. Europe flew eight times with a partial failure, followed by India and Japan with seven and six successful flights, respectively.
  • John Preskill: The recent achievement by the Google team bolsters our confidence that quantum computing is merely really, really hard. If that’s true, a plethora of quantum technologies are likely to blossom in the decades ahead.
  • Quinn: What people lose sight of is that infrastructure, in almost every case, costs less than payroll.
  • Lauren Smiley: More older people than ever are working: 63% of Americans age 55 to 64 and 20% of those over 65. 
  • Sparkle: These 12 to 18 core CPU have lost most of their audience. The biggest audience for these consumer CPU was video editing and streaming. Video encoding and decoding with Nvidia NVENC is 10 times faster and now has the same or higher quality than CPU encoding. Software like OBS, Twitch studio, Handbrake, Sony Vegas now all support NVENC. The only major software suite that doesn’t support NVENC officially yet is Premiere.
  • Timothy Prickett Morgan: To move data from DRAM memory on the PIM modules to one of the adjacent DPUs on the memory chips takes about 150 picoJoules (pJ) of energy, and this is a factor of 20X lower than what it costs to move data from a DRAM chip on a server into the CPU for processing. It takes on the order of 20 pJ of energy to do an operation on that data in the PIM DPU, which is inexplicably twice as much energy in this table. The server with PIM memory will run at 700 watts because that in-memory processing does not come for free, but we also do not think that a modern server comes in at 300 watts of wall power.
  • William Stein: The supply/demand pendulum has swung away from providers in favor of customers, with various new entrants bringing speculative supply online, while the most voracious consumers remain in digestion mode. Ultimately, we believe it’s a question of when, not if hyperscale procurement cycles enter their next phase of growth, and the pendulum can swing back the other direction quickly.
  • Jen Ayers: Big-game hunters are essentially targeting people within an organization for the sole purpose of identifying critical assets for the purpose of deploying their ransomware. [Hitting] one financial transaction server, you can charge a lot more for that than you could for a thousand consumers with ransomware—you’re going to make a lot more money a lot faster.
  • Eric Berger: Without the landing vision system, the rover would most likely still make it to Mars. There is about an 85% chance of success. But this is nowhere near good enough for a $2 billion mission. With the landing camera and software Johnson has led development of, the probability of success increases to 99%.
  • s32167: The headline should be “Intel urges everyone to use new type of memory that lowers performance for every CPU architecture to fix their own architecture security issues.”
  • Robert Haas: So, the “trap” of synchronous replication is really that you might focus on a particular database feature and fail to see the whole picture. It’s a useful tool that can supply a valuable guarantee for applications that are built carefully and need it, but a lot of applications probably don’t report errors reliably enough, or retry transactions carefully enough, to get any benefit.  If you have an application that’s not careful about such things, turning on synchronous replication may make you feel better about the possibility of data loss, but it won’t actually do much to prevent you from losing data.
  • Scott Aaronson: If you were looking forward to watching me dismantle the p-bit claims, I’m afraid you might be disappointed: the task is over almost the moment it begins. “p-bit” devices can’t scalably outperform classical computers, for the simple reason that they are classical computers. A little unusual in their architecture, but still well-covered by the classical Extended Church-Turing Thesis. Just like with the quantum adiabatic algorithm, an energy penalty is applied to coax the p-bits into running a local optimization algorithm: that is, making random local moves that preferentially decrease the number of violated constraints. Except here, because the whole evolution is classical, there doesn’t seem to be even the pretense that anything is happening that a laptop with a random-number generator couldn’t straightforwardly simulate. 
  • Handschuh: Adding security doesn’t happen by chance. In some cases it requires legislation or standardization, because there’s liability involved if things go wrong, so you have to start including a specific type of solution that will address a specific problem. Liability is what’s going to drive it. Nobody will do it just because they are so paranoid that they think that it must be done. It will be somebody telling them
  • Battery: The best marketing today—particularly mobile marketing—is not about providing a point solution but, instead, offering a broader technology ecosystem to understand and engage customers on their terms. The Braze-powered Whopper campaign, for instance, helped transform an app that had been primarily a coupon-delivery service into a mobile-ordering system that also offered a deeper connection to the Burger King brand.
  • Jakob: I think that we need to think of programming just like any other craft, trade, or profession with an intersection on everyday life: it is probably good to be able to do a little bit of it at home for household needs. But don’t equate that to the professional development of industrial-strength software.  Just like being able to use a screwdriver does not mean you are qualified to build a house, being able to put some blocks or lines of code together does not make you a programmer capable of building commercial-grade software.
  • @benedictevans: TikTok is introducing Americans to a question that Europeans have struggled with for 20 years: a lot of your citizens might use an Internet platform created somewhere that doesn’t know or care about your laws or cultural attitudes and won’t turn up to a committee hearing
  • Robert Pollack: So let me say something about our uniqueness, which is embedded in our DNA. Simple probabilities. Every base pair in DNA has four possible base pairs. Three billion letters long. Each position in the text could have one of four choices. So how many DNAs are there? There are four times four two-letter words in DNA, four for the first letter, four for the second—sixteen possible two-letter words. Sixty-four possible three-letter words. That is to say, how many possible human genomes are there? Four to the power 3 billion, which is to say a ridiculous, infinite number. There are only 10^80 elementary particles in the universe. Each of us is precisely, absolutely unique while we are alive. And in our uniqueness, we are absolutely different from each other, not by more or less, but absolutely different.

Useful Stuff:

  • After 2000 years of taking things apart into smaller things, we have learned that all matter is made of molecules, and that molecules are made of atoms. Has Reductionism Run its Course? Or in the context of the cloud: Has FaaS Run Its Course? The “everything is a function” meme is a form of reductionism. And like reductionism in science FaaS reductionism has been successful, as the “business value” driven crowd is fond of pointing out. But that’s not enough when you want to understand the secrets of the universe, which in this analogy is figuring how to take the next step in building systems. Lambda is like the Large Hadron Collider in that it confirmed the standard model, but hasn’t moved us forward. At some point we need to stop looking at functions and explore using some theory driven insight. We see tantalizing bits of a greater whole as we layer abstractions on top of functions. There are event busses, service meshes, service discovery services, work flow systems, pipelines, etc.—but these are all still part of the standard model of software development. Software development like physics is stuck looking for a deeper understanding of its nature, yet we’re trapped in a gilded cage of methodological reductionism. Like for physics,  “the next step forward will be a case of theory reduction that does not rely on taking things apart into smaller things.”
  • SmashingConf Freiburg 2019 videos are now available. You might like The Anatomy Of A Click
  • There’s a lot of energy and ideas at serverlessconf:
    • If you’re looking for the big picture: ServerlessConf NYC 2019: everything you missed
    • @jeremy_daly: Great talk by @samkroon. Every month, @acloudguru uses 240M Lambda calls, 180M API Gateway calls, and 90TB of data transfer through CloudFront. Total cost? ~$2,000 USD. #serverless #serverlessftw #Serverlessconf
    • @ryans140: We’re in a similar situation.  3 environments,  60+ microservices,  serverless datakake. $1400 a month.   Down from $12k monthly in a  vm based datacenter.
    • @gitresethard: This is a very real feeling at #Serverlessconf this year. There’s a mismatch between the promise of focusing on your core differentiators and the struggle with tooling that hasn’t quite caught up.
    • @hotgazpacho: “Kubernetes is over-hyped and elevating the least-interesting part of your application. Infrastructure should be boring.” – @lindydonna 
    • @QuinnyPig: Lambda: “Get a file” S3: “Here it is.” There’s a NAT in there.  (If it’s the Managed NAT gateway you pay a 4.5¢ processing fee / @awscloud tax on going about your business.) #serverlessconf
    • @ben11kehoe: Great part the @LEGO_Group serverless story: they started with a single Lambda, to calculate sales tax. Your journey can start with a small step! #Serverlessconf
    • @ryanjonesirl: Thread about @jeremy_daly talk at #Serverlessconf #Serverless is (not so) simple Relational and Lambda don’t mix well.
    • @jssmith: Just presented the Berkeley View on #Serverless at #Serverlessconf
      • Serverless is more than FaaS
      • Cloud programming simplified
      • Next phase in cloud evolution
      • Using servers will seem like using assembly language
      • Serverless computing will serve just about every use case
      • Serverless computing bill will converge to the serverful cost
      • Machine learning will play an important role in optimizing execution
      • Serverless computing will embrace heterogeneous hardware (GPU, TPU, etc) 
      • Serverful cloud computing will decline relative to serverless computing
  • Awesome writeup. Lots to learn on how to handle 4,200 Black Friday orders per minute, especially if you’re interested in running an ecommerce site on k8s in AWS using microservices. Building and running application at scale in Zalando
    • In our last Black Friday, we broke all the records of our previous years, and we had around 2 million orders. In the peak hour, we reached more than 4,200 orders per minute.
    • We have come from a long run, we have migrated from monolith to microservices around 2015. Nowadays, in 2019, we have more than 1,000 microservices. Our current tech organization is composed from more than 1,000 developers, and we are more than 200 teams. Every team is organized strategically to cover a customer journey, and also a business thing. Every team can also have different team members with multidisciplinary skills like frontend, backend data scientists, UX, researcher, product, whatever is needed that our team needs to fulfill.
    • Since we have all of these things, we also have end-to-end responsibility for the services that every team has to manage…We also found out that it’s not easy that every team do their way, so we end up having standard processes of how we develop software. This was enabled by the tools that our developer productivity team provides us. Every team can easily start a new project, can set it up, can start coding, build it, test it, deploy it, monitor it, and so on, in all the software development cycle
    • All our microservices are run in AWS and Kubernetes. When we migrated from monolith to microservices, we also migrated to the cloud. We start to use AWS like EC2 instances and cloud formations…All our microservices, not only checkout, but also lambda microservices are running in containers. Every microservice environment is obstructed from our infrastructure.
    • After this, we also have frontend fragments, which are frontend microservices. Frontend microservices are services that provide server-side rendering of what we call fragments. A fragment is a piece of a page, for example, a header, a body, a content, or a footer. You can have one page where you can see one thing, but every piece can be something that different teams owned.
    • Putting it all together, we do retries of operations with exponential back off. We wrap operations with the circuit breaker. We handle failures with fallbacks when possible. Otherwise, we have to make sure to handle the exceptions to avoid unexpected errors.
    • Every microservice that we have has the same infrastructure. We have a load balancer who handles the incoming request. Then this distributes the request through the replication of our microservice in multiple instances, or if we are using Kubernetes in multiple ports. Every instance is running with a Zalando-based image. This Zalando-based image contains a lot of things that are needed to be compliant, to be secure, to make sure that we have the right policies implemented because we are a serious company, and because we take seriously our business
    • What we didn’t know is that when we have more instances, it also means that we have more database connections. Before, even if we were having 26 million active customers using the website in different patterns, it was not a problem. Now, we have 10 times more instances creating connections to our Cassandra database. The poor Cassandra was not able to handle all of these connections.
    • Consider doing rollouts, consider having the same capacity for the current traffic that you have. Otherwise, your service is likely to become unavailable, just because you’ve introduced a new feature, but you have to make sure that this is also handled.
    • For our Black Friday preparation, we have a business forecast for tellers, we want to make this and that amount of orders, then we also have load testing of real customer journey
    • Then all the services involved in all this journey are identified, then we had to load testing in top of this. With this week, we were able to do capacity planning, so we could scale our service accordingly, and we could also identify bottlenecks, or things that we might need to fix for Black Friday.
    • For every microservice that is involved in Black Friday, we also have a checklist where we review, is the architecture and dependencies reviewed? Are the possible points of failures identified and mitigated? Do we have reliability patterns for all our microservices that are involved? Are configurations adjustable without need of deployment?
    • we are one company doing Black Friday. Then we have other 100 companies or more also doing Black Friday. What happened to us already in one Black Friday, I think, or two, was that AWS run out of resources. We don’t want to make a deployment and start new instances because we might get into the situation where we get no more resources in AWS
    • In the final day of Black Friday, we have a situation room. All teams that are involved in the services that are relevant for the Black Friday are gathered in one situation room. We only have one person per team. Then we are all together in this space where we monitor, and we support each other in case there is an incident or something that we need to handle
  • Videos from CppCon 2019 are now available. You might like Herb Sutter “De-fragmenting C++: Making Exceptions and RTTI More Affordable and Usable
  • Introducing SLOG: Cheating the low-latency vs. strict serializability tradeoff: Bottom line: there is a fundamental tradeoff between consistency and latency. And there is another fundamental tradeoff between serializability and latency… it is impossible to achieve both strict serializability and low latency reads and writes…By cheating the latency-tradeoff, SLOG is able to get average latencies on the order of 10 milliseconds for both reads and writes for the same geographically dispersed deployments that require hundreds of milliseconds in existing strictly serializable systems available today. SLOG does this without giving up strict serializability, without giving up throughput scalability, and without giving up availability (aside from the negligible availability difference relative to Paxos-based systems from not being as tolerant to network partitions). In short, by improving latency by an order of magnitude without giving up any other essential feature of the system, an argument can be made that SLOG is strictly better than the other strictly serializable systems in existence today.
  • Data races are very hard to find. Usually the way you find them is a late night call when a system locks up for no discernable reason. So it’s remarkable Google’s Kernel Concurrency Sanitizer (KCSAN) found over 300  data race conditions within the Linux kernel. Here’s the announcement.
  • How much will you save running Windows on AWS? A lot says IDC in The Infrastructure Cost and Staff Productivity Benefits of Running High-Performing Windows Workloads in the AWS Cloud: Based on interviews with these organizations, IDC quantifies the value they will achieve by running Windows workloads on AWS at an average of $157,300 per 100 users per year ($6.59 million per organization)…IT infrastructure cost reductions: Study participants reduce costs associated with running on-premises environments and benefit from more efficient use of infrastructure and application licenses…IT staff productivity benefits: Study participants reduce the day-to-day burden on IT…Risk mitigation — user productivity benefits: Study participants minimize the operational impact of unplanned application outages…Business productivity benefits: Study participants better address business opportunities and provide their employees with higher-performing and more timely applications and features infrastructure, database, application management, help desk, and security teams and enable application development teams to work more effectively.
    • Food and beverage organization: “We definitely go ‘on the cheap’ to start with AWS because it’s easy just to add extra storage per server instance in seconds. We will spin up a workload with what we feel is the minimum, and then add to it as needed. It definitely has put us in a better place to utilize resources regarding services and infrastructure.”
    • Healthcare organization: Licensing cost efficiencies was one of the reasons we went to the cloud with AWS. The way that you collaborate these licensing contracts through AWS for software licenses versus having to buy the licenses on our own has already been more cost effective for us. We’re saving 10%.
  • A fun approach to learning SQL. NUKnightLab/sql-mysteries: There’s been a Murder in SQL City! The SQL Murder Mystery is designed to be both a self-directed lesson to learn SQL concepts and commands and a fun game for experienced SQL users to solve an intriguing crime.
  • Caching improves your serverless application’s scalability and performance. It helps you keep your cost in check even when you have to scale to millions of users. All you need to know about caching for serverless applications: Lambda auto-scales by traffic. But it has limits… if your traffic is very spiky then the 500/min limit will be a problem…Caching improves response time as it cuts out unnecessary roundtrips…My general preference is to cache as close to the end-user as possible…Where should you implement caching? Route53 as the DNS. CloudFront as the CDN. API Gateway to handle authentication, rate limiting and request validation. Lambda to execute business logic. DynamoDB as the database. (A minimal warm-invocation caching sketch appears at the end of this list.)
  • To quote the Good Place, “This is forked.” But in a good way. A Multithreaded Fork of Redis That’s 5X Faster Than Redis
    • In regards to why fork Redis in the first place, KeyDB has a different philosophy on how the codebase should evolve. We feel that ease of use, high performance, and a “batteries included” approach is the best way to create a good user experience. While we have great respect for the Redis maintainers it is our opinion that the Redis approach focusses too much on simplicity of the code base at the expense of complexity for the user. This results in the need for external components and workarounds to solve common problems.
    • KeyDB works by running the normal Redis event loop on multiple threads. Network IO, and query parsing are done concurrently. Each connection is assigned a thread on accept(). Access to the core hash table is guarded by spinlock. Because the hashtable access is extremely fast this lock has low contention. Transactions hold the lock for the duration of the EXEC command. Modules work in concert with the GIL which is only acquired when all server threads are paused. This maintains the atomicity guarantees modules expect.
    • @kellabyte: I’ve been saying for years the architecture of Redis has been poorly designed in it’s single threaded nature among several other issues. KeyDB is a multi-threaded fork that attempts to fix some of these issues and achieves 5x the perf. Antirez has convinced a lot of people that whatever he says must be true 😛 Imaging running 64 instances of redis on a 64 core box? Oh god haha…I do. Having built Haywire up to 15 million HTTP requests/second using the same architecture myself I believe the numbers. It’s good engineering.
  • Frugal computing: Companies care about cheap computing…How can we trade-off speed with monetary cost of computing?…With frugal computing, we should try to avoid the cost of state synchronization as much as possible. So work should be done on one machine if it is cheaper to do so and the generous time budget is not exceeded…Memory is expensive but storage via local disk is not. And time is not pressing. So we can consider out-of-core execution, juggling between memory and disk…Communication costs money. So batching communication and trading off computation with communication…We may then need schemes for data-naming (which may be more sophisticated then simple key), so that a node can locate the result it needs in S3 instead of computing itself. This can allow nodes to collaborate with other nodes in an asynchronous, offline, or delay-tolerant way…In frugal computing, we cannot afford to allocate extra resources for fault-tolerance, and we need to do in a way commensurate with the risk of fault and the cost of restarting computation from scratch. Snapshots that are saved for offline collaboration may be useful for building frugal fault-tolerance.
  • A good summary from DevSecCon Seattle 2019 Round Up
  • Corruption is a work around, it’s a utility in a place where there are fewer better options to solve a problemInnovation is the antidote to corruption~ Corruption is not the problem hindering our development. In fact, conventional thinking on corruption and its relationship to development is not only wrong it’s holding many poor countries back…many programs fail to reduce corruption because we have the equation backwards. Societies don’t develop because they’ve reduced corruption, they are able to reduce corruption because they’ve developed. And societies develop through investment in innovation…there’s a relationship between scarcity and corruption, in most poor countries way too many basic things are scarce…this creates the perfect breeding ground for corruption to occur…investing in businesses that make things affordable and accessible to more people attacks this scarcity and creates the revenues for governments to reinvest in their economies. When this happens on a country wide level it can revolutionize nations…as South Korea became prosperous it was able to transition from an authoritarian government to a democratic government and has been able to reinvest in building its institutions and this has payed off…what we found when we looked at most prosperous countries today they were able to reduce corruption as they became prosperous, not before.
  • My take on: Percona Live Europe and ProxySQL Technology Day: It comes without saying that MySQL was the predominant and the more interesting tracks were there. This not because I come from MySQL, but because the ecosystem was helping the track to be more interesting. Postgres was having some interesting talk, but let us say clearly, we had just few from the community. Mongo was really low in attendee. The number of attendees during the talks and the absence of the MongoDb community was clearly indicating that the event is not in the are of interest of the MongoDB utilizers.
  • Put that philosophy degree to work. Study some John Stuart Mill and you’re ready for a job in AI. What am I talking about? Peter Norvig in Artificial Intelligence: A Modern Approach talks about how AI started out by defining AI as maximize expected utility; just give us the utility function and we have all these cool techniques on how optimizing them. But now we’re saying maybe the optimization part is the easy part and the hard part is deciding what is my utility function. What do we want as a society? What is utility? Utilitarianism is filled with just these kind of endless debates. And as usual when you dive deep absolutes fade away and what remains are shades of grey. As of yet there’s no utility calculus. So if you’re expecting AI to solve life’s big questions it turns out we’ll need to solve them before AI can.
  • You too can use these techniques. Walmart Labs on Here’s What Makes Apache Flink scale:
    • I have been using Apache Flink in production for the last three years, and every time it has managed to excel at any workload that is thrown at it. I have run Flink jobs handling datastream at more than 10 million RPM with not more than 20 cores.
    • Reduce Garbage Collection – Flink takes care of this by managing memory itself.
    • Minimize data transfer – several mapping and filter transformations are done sequentially in a single slot. This chaining minimizes the sharing of data between slots and multiple JVM processes. As a result, jobs have a low network I/O, data transfer latencies, and minimal synchronization between objects.
    • Squeeze your bytes – To avoid storing such heavy objects, Flink implements its serialization algorithm, which is much more space-efficient.
    • Avoid blocking everyone – Flink revamped its network communications after Flink 1.4. This new policy is called credit-based flow control. Receiver sub-tasks announce how many buffers they have left to sender sub-tasks. When a sender becomes aware that a receiver doesn’t have any buffers left, it merely stops sending to that receiver. This helps in preventing the blocking of TCP channels with bytes for the blocking sub-task.
  • A good experience report from The Full Stack Fest Experience 2019
  • Places to intervene in a system: 12. Constants, parameters, numbers (such as subsidies, taxes, standards); 11. The sizes of buffers and other stabilizing stocks, relative to their flows; 10. The structure of material stocks and flows (such as transport networks, population age structures); 9. The lengths of delays, relative to the rate of system change; 8. The strength of negative feedback loops, relative to the impacts they are trying to correct against; 7. The gain around driving positive feedback loops; 6. The structure of information flows (who does and does not have access to information); 5. The rules of the system (such as incentives, punishments, constraints); 4. The power to add, change, evolve, or self-organize system structure; 3. The goals of the system; 2. The mindset or paradigm out of which the system — its goals, structure, rules, delays, parameters — arises; 1. The power to transcend paradigms.
  • The big rewrite can work, but perhaps the biggest lesson is big design up front is almost always a losing strategy. Why we decided to go for the Big Rewrite: We used to be heavily invested into Apache Spark – but we have been Spark-free for six months now…One of our original mistakes (back in 2014) had been that we had tried to “future-proof” our system by trying to predict our future requirements. One of our main reasons for choosing Apache Spark had been its ability to handle very large datasets (larger than what you can fit into memory on a single node) and its ability to distribute computations over a whole cluster of machines4. At the time, we did not have any datasets that were this large. In fact, 5 years later, we still do not…With hindsight, it seems obvious that divining future requirements is a fool’s errand. Prematurely designing systems “for scale” is just another instance of premature optimization…We do not need a distributed file system, Postgres will do…We do not need a distributed compute cluster, a horizontally sharded compute system will do…We do not need a complicated caching system, we can simply cache whole datasets in memory instead…We do not need cluster-wide parallelism, single-machine parallelism will do…We do not need to migrate the storage layer and the compute layer at the same time, we can do one after the other…Avoid feature creep…Test critical assumptions early…Break project up into a dependency tree…Prototype as proof-of-concept…Get new code quickly into production…Opportunistically implement new features…Use black-box testing to ensure identical behavior…Build metrics into the system right from the start…Single-core performance first, parallelism later.
  • Interesting mix of old and new. What is the technology behind Nextdoor in 2019? 
    • Deploying to production 12–15 times. Inserting billions of rows to our Postgres and DynamoDB tables. Handling millions of user sessions concurrently. 
    • Django Framework for web applications; NGINX and uWSGI to serve our Python 3 code, served behind an Amazon Elastic Load Balancer; Conda to manage our Python environments; MyPy to add type safety to the codebase.
    • PostgreSQL is the database. Horizontal scaling uses a combination of application-specific read replicas as well as a connection pooler (PgBouncer); a custom Load Balancer microservice is used in front of the databases; DynamoDB for documents that need fast retrieval.
    • Memcached and HAProxy help with performance; Redis via ElastiCache is used to use the right data type for the job; CloudFront as the CDN; SQS for job queues.
    • Jobs are consumed off SQS using a custom Python-based distributed job processor called Taskworker. They built a cron-type system on top of Taskworker.
    • Microservices are written in Go and use gorilla/mux as the router. Zookeeper for service configuration. Communicating between services uses a mix of SQS, Apache Thrift and JSON APIs. Storage is mostly DynamoDB. 
    • Most data processing is done via AirFlow, which aggregates PostgreSQL data to S3 that then loads it into Presto.
    • For Machine Learning: Scikit-Learn, Keras, and Tensorflow.
    • Services are deployed as Docker images, using docker-compose for local development, ECS / Kubernetes for prod/staging environments.
    • Considering moving everything to k8s in the future.
    • Python deployments are done via Nextdoor/conductor, a Go app in charge of continuously releasing our application via Trains, a group of commits to be delivered together. Releases are made using CloudFormation via Nextdoor/Kingpin.
    • React and Redux on the frontend speaking GraphQL and JSON APIs. 
    • PostGIS extension is used for spatial operations using libraries like GDAL and GEOS for spatial algorithms and abstractions, and tools like Mapnik and the Google Maps API to render map data.
    • Currently in the process of developing a brand new data store and custom processing pipeline to manage the high volume of geospatial data expected to store (1B+ rows) as they expand internationally.
  • How LinkedIn customizes Apache Kafka for 7 trillion messages per day
    • At LinkedIn, some larger clusters have more than 140 brokers and host one million replicas in a single cluster. With those large clusters, we experienced issues related to slow controllers and controller failure caused by memory pressure. Such issues have a serious impact on production and may cause cascading controller failure, one after another. We introduced several hotfix patches to mitigate those issues—for example, reducing controller memory footprint by reusing UpdateMetadataRequest objects and avoiding excessive logging.
    • As we increased the number of brokers in a cluster, we also realized that slow startup and shutdown of a broker can cause significant deployment delays for large clusters. This is because we can only take down one broker at a time for deployment to maintain the availability of the Kafka cluster. To address this deployment issue, we added several hotfix patches to reduce startup and shutdown time of a broker (e.g., a patch to improve shutdown time by reducing lock contention). 
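Returning to the serverless caching item above: one of the cheapest layers to add is in-process caching inside the Lambda itself, since module-level state survives across warm invocations of the same container. A hedged sketch with boto3; the table name, key schema, TTL, and event shape are illustrative assumptions, not anything from the linked article.

```python
# Hypothetical warm-container cache for a Lambda handler backed by DynamoDB.
# Table name, key schema, TTL, and the event shape are made-up examples.
import time
import boto3

TABLE_NAME = "products"            # assumed table
CACHE_TTL_SECONDS = 60

_table = boto3.resource("dynamodb").Table(TABLE_NAME)
_cache = {}                        # {product_id: (expires_at, item)} -- survives warm starts

def get_product(product_id: str) -> dict:
    now = time.time()
    cached = _cache.get(product_id)
    if cached and cached[0] > now:
        return cached[1]           # served from the warm container, no network round trip
    item = _table.get_item(Key={"id": product_id}).get("Item", {})
    _cache[product_id] = (now + CACHE_TTL_SECONDS, item)
    return item

def handler(event, context):
    # Minimal API Gateway-style handler; CloudFront and API Gateway caching
    # sit in front of this, as the linked post recommends.
    product = get_product(event["pathParameters"]["id"])
    return {"statusCode": 200, "body": str(product)}
```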

Soft Stuff:

  • Hydra (article): a framework for elegantly configuring complex applications. Hydra offers an innovative approach to composing an application’s configuration, allowing changes to a composition through configuration files as well as from the command line.
  • uttpal/clockwork (article): a general purpose distributed job scheduler. It offers a horizontally scalable scheduler with at-least-once delivery guarantees. The currently supported task delivery mechanism is Kafka; at task execution time, the schedule data is pushed to the given Kafka topic.
  • linkedin/kafka: the version of Kafka running at LinkedIn. Kafka was born at LinkedIn. We run thousands of brokers to deliver trillions of messages per day. We run a slightly modified version of Apache Kafka trunk. This branch contains the LinkedIn Kafka release.
  • serverlessunicorn/ServerlessNetworkingClients (article): Serverless Networking adds back the “missing piece” of serverless functions, enabling you to perform distributed computations, high-speed workflows, easy to use async workers, pre-warmed capacity, inter-function file transfers, and much more.

Pub Stuff: 

  • LSST Active Optics System Software Architecture:  In this paper, we describe the design and implementation of the AOS. More particularly, we will focus on the software architecture as well as the AOS interactions with the various subsystems within LSST.
  • Content Moderation for End-to-End Encrypted Messaging: I would like to reemphasize the narrow goal of this paper: demonstrating that forms of content moderation may be technically possible for end-to-end secure messaging apps, and that enabling content moderation is a different problem from enabling law enforcement access to content. I am not yet advocating for or against the protocols that I have described. But I do see enough of a possible path forward to merit further research and discussion.
  • SLOG: Serializable, Low-latency, Geo-replicated Transactions (article): For decades, applications deployed on a world-wide scale have been forced to give up at least one of (1) strict serializability (2) low latency writes (3) high transactional throughput. In this paper we discuss SLOG: a system that avoids this tradeoff for workloads which contain physical region locality in data access. SLOG achieves high-throughput, strictly serializable ACID transactions at geo-replicated distance and scale for all transactions submitted across the world, all the while achieving low latency for transactions that initiate from a location close to the home region for data they access. Experiments find that SLOG can reduce latency by more than an order of magnitude relative to state-of-the-art strictly serializable geo-replicated database systems such as Spanner and Calvin, while maintaining high throughput under contention.
  • FCC-hh: The Hadron Collider: This report contains the description of a novel research infrastructure based on a highest-energy hadron collider with a centre-of-mass collision energy of 100 TeV and an integrated luminosity of at least a factor of 5 larger than the HL-LHC. It will extend the current energy frontier by almost an order of magnitude. The mass reach for direct discovery will reach several tens of TeV, and allow, for example, the production of new particles whose existence could be indirectly exposed by precision measurements during the preceding e+e− collider phase.

from High Scalability

Stuff The Internet Says On Scalability For October 4th, 2019

Wake up! It’s HighScalability time:

SpaceX ready to penetrate space with their super heavy rocket. (announcement)

Do you like this sort of Stuff? I’d greatly appreciate your support on Patreon. And I wrote Explain the Cloud Like I’m 10 for all who want to understand the cloud. On Amazon it has 57 mostly 5 star reviews (135 on Goodreads). Please recommend it. They’ll love you even more.

Number Stuff: 

  • 94%: lost value in Algorand cryptocurrency in first three months.
  • 38%: increase in machine learning and analytics driven predictive maintenance in manufacturing in the next 5 years.
  • 97.5%: Roku channels tracked using Doubleclick. Nearly all TVs tested contact Netflix, even without a configured Netflix account. Half of TVs talk to tracking services. 
  • 78: SpaceX launches completed in 11 years.
  • 70: countries that have had disinformation campaigns. 
  • 99%: of misconfigurations go unreported in the public cloud.
  • 12 million: reports of illegal images of child sex abuse on Facebook Messenger in 2018.
  • 40%: decrease in trading volume on some major crypto exchanges last month. 
  • 2016: year of peak global smartphone shipments. 
  • 400,000: trees a drone can plant per day. It fires seed missiles into the ground. In less than a year the trees are 20 inches tall.
  • 14 million: Uber rides per day.
  • 90%: large, public game companies (Epic, Ubisoft, Nintendo) run on the AWS cloud. 
  • $370B: public cloud spending by 2022.
  • C: language with the most curse words in the comments.
  • 45 million: DynamoDB TPS. Bitcoin? 14 transactions per second.
  • 50,000: Zoho requests per second.
  • 700 nodes: all it takes for Amazon to cover Los Angeles with a 900 MHz network.
  • One Quadrillion: per day real-time metrics at Datadog.

Quotable Stuff:

  • @Hippotas: The LSM tree is the workhorse for many modern data management systems. Last week we lost Pat O’Neil, one of the inventors of the LSM tree. Pat had great impact in the field of databases (LSM, Escrow xcts, LRU-K, Bitmap Indices, Isolation levels, to name few). He will be missed.
  • @ofnumbers: pro-tip: there is no real point in using a blockchain – much less a licensed “distributed ledger” – for internal use within a silo’ed organization.  marketing that as a big innovative deal is disingenuous.
  • @amcafee: More on this: US digital industries have exploded over the past decade (and GDP has grown by ~25%), yet total electricity use has been ~flat. This is not a coincidence; it’s cause and effect. Digitization makes the entire economy more energy efficient.
  • ICO: GDPR and DPA 2018 strengthened the requirement for organisations to report PDBs. As a result, we received 13,840 PDB reports during 2018-19, an increase from 3,311 in 2017-18.
  • Jeanne Whalen: Demand for labeling is exploding in China as large tech companies, banks and others attempt to use AI to improve their products and services. Many of these companies are clustered in big cities like Beijing and Shanghai, but the lower-tech labeling business is spreading some of the new-tech money out to smaller towns, providing jobs beyond agriculture and manufacturing.
  • IDC: Overall, the IT infrastructure industry is at crossing point in terms of product sales to cloud vs. traditional IT environments. In 3Q18, vendor revenues from cloud IT environments climbed over the 50% mark for the first time but fell below this important tipping point since then. In 2Q19, cloud IT environments accounted for 48.4% of vendor revenues. For the full year 2019, spending on cloud IT infrastructure will remain just below the 50% mark at 49.0%. Longer-term, however, IDC expects that spending on cloud IT infrastructure will grow steadily and will sustainably exceed the level of spending on traditional IT infrastructure in 2020 and beyond.
  • Brian Roemmele: I can not overstate enough how important this Echo Network will become. This is Amazon owning the entire stack. Bypassing the ancient cellular network concepts and even the much heralded 5G networks.
  • Ebru Cucen: Why serverless? Everyone from management to engineering wanted serverless. It was the first project everyone was on board with.
  • Jessica Kerr: Every piece of software and infrastructure that the big company called a capital investment, that they value because they put money into it, that they keep using because it still technically works — all of this weight slows them down.
  • @Obdurodon: Saw someone say recently that bad code crowds out good code because good code is easy to change and bad code isn’t. It’s not just code. (1/2)
  • Paul Nordstrom: spend more time talking to your users about how they would use your system, show your design to more people, you know, just shed the ego and shed this need for secrecy if you can, so that you get a wider spectrum of people who can tell you, I’m gonna use it like this. And then, when you run into the inevitable problem, you know, then you just have to, that having done the work that did before, your system will be cleaner design, you’ll have this mathematical model.
  • @shorgio: Hotels are worse for long term rent prices.  Airbnb keeps hotel profits in check.  Without Airbnb, hotel margins grow so there is an incentive to rezone to build more hotels, which can’t be converted back into actual homes
  • @techreview: In January, WhatsApp limited how often messages can be forwarded—to only five groups instead of 256—in an attempt to slow the spread of disinformation. New research suggests that the change is working.
  • @dhh: The quickest way to ruin the productivity of a small company is to have it adopt the practices of a large company. Small companies don’t just need the mini version of whatever maxi protocol or approach that large companies use. They more often than not need to do the complete opposite.
  • @random_walker: When we watch TV, our TVs watch us back and track our habits. This practice has exploded recently since it hasn’t faced much public scrutiny. But in the last few days, not one but *three* papers have dropped that uncover the extent of tracking on TVs. Let me tell you about them.
  • @lrvick: The WhatsApp backdoor is now public and official. I have said this many times: there is no future for privacy or security tools that are centralized or proprietary. If you can’t decentralize it some government will strongarm you for access.
  • Rahbek: The global pattern of biodiversity shows that mountain biodiversity exhibits a visible signature of past evolutionary processes. Mountains, with their uniquely complex environments and geology, have allowed the continued persistence of ancient species deeply rooted in the tree of life, as well as being cradles where new species have arisen at a much higher rate than in lowland areas, even in areas as amazingly biodiverse as the Amazonian rainforest
  • @mipsytipsy: key honeycomb use cases.  another: “You *could* upgrade your db hardware to a m2.4xl.  Or you could sum up the db write lock time held, break down by app, find the user consuming 92% of all lock time, realize they are on your free tier…and throttle that dude.”
  • Dale Rowe: The internet is designed as a massive distributed network with no single party having total control. Fragmenting the internet (breaking it down into detached networks) would be the more likely result of an attempt. To our knowledge this hasn’t been attempted but one would imagine that some state actors have committed significant research to develop internet kill switches.
  • @cloud_opinion: Although we typically highlight issues with GCP, there are indeed some solid products there – have been super impressed with GKE – its solid, priced right and works great. Give this an A+.
  • David Wootton: This statement seems obvious to us, so we are surprised to discover that the word competition was a new one in Hobbes’ time, as was the idea of a society in which competition is pervasive. In the pre-Hobbesian world, ambition, the desire to get ahead and do better than others, was universally condemned as a vice; in the post-Hobbesian world, it became admirable, a spur to improvement and progress.
  • John Currey: What’s really nice with the randomization is that every node is periodically checking every other node. They’re not checking that particular node so often, but collectively, all the nodes are still checking all of the other nodes. This greatly reduces the chance of a particular node failure not being discovered
  • E-Retail Expansion Report: With over $140 billion in ecommerce sales to consumers in other countries, some U.S. retailers are thinking globally. But only half of U.S. retailers in the Internet Retailer Top 1000 accept online orders from consumers in other countries. The most common way Top 1000 e-retailers sell to shoppers in foreign nations is by accepting orders on their primary websites and then shipping parcels abroad. However, only 46.4% of these retailers ship to the United Kingdom, and 43.4% ship to Japan, two of the largest ecommerce markets. Larger retailers are more likely than smaller ones to ship to foreign addresses, with 70.1% of Top 1000 retailers ranked Nos. 1-100 shipping outside of North America, compared to only 48.4% of those ranked 901-1000
  • Ruth Williams: The finding that a bacterium within a bacterium within an animal cell cooperates with the host on a biosynthetic pathway suggests the endosymbiont is, practically speaking, an organelle.

Useful Stuff: 

  • WhatsApp experiences a connection churn of 600k to 1.5 million connections per second. WhatsApp is famous for using very few servers running Erlang in their core infrastructure. With the 2014 Facebook acquisition a lot has changed, but a lot hasn’t changed too. Seems like they’ve kept that same Erlang spirit. Here’s a WhatsApp update on Scaling Erlang Cluster to 10,000 Nodes
    • Grew from 200m users in 2013 to 1.5 billion in 2018 so they needed more processing power as they add more features and users. In the process they were moving from SoftLayer (IBM, FreeBSD, Erlang R16) to Facebook’s infrastructure (Open Compute, Linux, Erlang R21) after the 2014 acquisition. This required moving from large powerful dual socketed servers to tiny blades with a max of 32 gig of RAM. Facebook’s approach is to pack a lot of servers into a tiny space. Had to move to Erlang R21 to get the networking performance and connection density on Linux that they had on FreeBSD. Now they have a combination of old and new machines in a single cluster and they went from just a few servers to 10,000 smaller Facebook servers. 
    • An Erlang cluster is a mesh. Every node connects to every other node in the cluster. That’s a lot of connections. Not a problem because a million users are assigned to a single server, so adding 10,000 connections to a server is not a big deal. They put 1500 nodes in a single cluster with no connection problems. The problem is discovery, when a user on one server talks to another user on a different server. They use two process registries. One is centralized for high-rate registrations that acts as a session manager for phones connecting to servers. Every time a phone connects it registers itself in a session manager. A second process registry uses pg2 and globally replicated state for rare changes. A phone connects to an Erlang server called a chat node. When a phone wants to connect to another phone it asks a session manager which server that phone is connected to. They have a connection churn of 600k to 1.5 million connections per second. pg2 is used for service discovery, mapping servers to services. Phone numbers are hashed to servers (a minimal sketch of this hashing-plus-session-manager idea follows this list). Meta-clusters are clusters of services: chat, offline, session, contacts, notifications, groups—that are mesh connected as needed. Even with all their patches they can’t scale pg2 to 1500 nodes. Clusters are connected with wandist, a custom service.
    • It wasn’t easy to move from FreeBSD to Linux; kqueue is awesome and epoll is not as awesome. Erlang R21 supports multiple poll sets so it leverages existing Linux network capabilities. With kqueue you can update a million file descriptors with a single call. With epoll you would need a million individual kernel calls. Given recent security concerns, system calls are not as cheap as you would like them to be.
    • As in 2014, most scalability problems are caused by a lack of concurrency, which means locking bottlenecks. Bottlenecks must be identified and fixed. Routing performance was a problem. Moving to multiple datacenters meant they had to deal with long-range communications, which added more latency. Some bottlenecks were found and overcome by adding more concurrency and more workers. Another problem is SSL is really slow on Erlang.
    • There are also lots of Erlang bugs they had to fix. The built-in tools are great for fixing problems. First line is using the built-in inspection facilities. For distributed problems they use MSACC – microstate accounting with extra accounting turned on. Lock Counting is a tool to find locks. Since Erlang is open source you can change code to help debugging. 
    • Erlang is getting better so many of the patches they made originally are no longer needed. For example, Erlang introduced off heap messages to reduce garbage collection pressure. But as WhatsApp grows they run into new bottlenecks, like the need for SSL/TLS handshake acceleration. WhatsApp adds more monitoring, statistics, wider lock tables, more concurrency. Some of these patches will go upstream, but many never will. The idea is because Erlang is open source you can make your own version. They are now trying to be more open and push more of their changes upstream.
  • eBay created a 5000 node k8s cluster for their cloud platform. Here’s how they made it workish. Scalability Tuning on a Tess.IO Cluster.
    • To achieve the reliability goal of 99.99%, we deploy five master nodes in a Tess.IO cluster to run Kubernetes core services (apiserver, controller manager, scheduler, etcd, and etcd sidecar, etc). Besides core services, there are also Tess add-ons on each node that expose metrics, set up networks, or collect logs. All of them are watching resources they care about from the cluster control plane, which brings additional load on the Kubernetes control plane. All the IPs used by the pod network are globally routable in the eBay data center. The network agent on each node is in charge of configuring the network on the host.
    • There were problems: Failed to recover from failures on cluster with: 5k nodes, 150k pods; Pod scheduling is slow in a large cluster; Large list requests will destroy the cluster; Etcd keeps changing leaders.
    • There were solutions, but it took a lot of work. If you aren’t eBay it might be difficult to pull off.
  • The Evolution of Spotify Home Architecture. This is a common story these days. The move from batch to streaming; the move from running your own infrastructure to moving to the cloud; the move from batch recommendations to real-time recommendations; the move from relative simplicity to greater system complexity; the move from more effort put into infrastructure to more effort being put into product.
    • At Spotify, we have 96 million subscribers, 207 million monthly active users, we’ve paid out over €10 billion to rights holders. There are over 40 million songs on our platform, over 3 billion playlists on our service, and we’re available in 79 markets.
    • We were running a lot of Hadoop jobs back in 2016. We had a big Hadoop cluster, one of the largest in Europe at the time, and we were managing our services and our databases in-house, so we were running a lot of things on-premise. Experimentation in the system can be difficult. Let’s say you have a new idea for a shelf, a new way you want to make a recommendation to a user, there’s a lot in the system that you need to know about to be able to get to an A/B test. There’s also a lot of operational overhead needed to maintain Cassandra and Hadoop. At that time we were running our own Hadoop cluster, we had a team whose job it was just to make sure that that thing was running.
    • We started to adopt services in 2017, at the time when Spotify was investing in and moving to GCP. What are some of the cons of this? You saw that as we added more and more content, as we added more and more recommendations for the users, it would take longer to load home because we are computing these recommendations at request time. We also saw that since we don’t store these recommendations anywhere, if for some reason the request failed, the user would just see nothing on the homepage, that’s a very bad experience.
    • In 2018, Spotify is investing heavily in moving the data stack also to Google Cloud. Today, we’re using a combination of streaming pipelines and services to compute recommendations on home that you see today. What’s the streaming pipeline? We are now updating recommendations based on user events. We are listening to the songs you have listened to, the artists you have followed, and the tracks you have hearted, and we make decisions based on that. We’ve separated out computation of recommendations and serving those recommendations in the system. What are some of the cons? Since we added the streaming pipelines into this ecosystem, the stack has just become a little bit more complex. Debugging is more complicated, if there is an incident on your side, you have to know whether it’s the streaming pipeline, or your service, it’s the logic, or it is because Bigtable is having an issue.
  • Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
    • The key challenge I think for any of us who have worked with S3 is it’s great at this like bulk durable storage where you just want the blob back. But for any type of high throughput, definitely with our real-time requirements, S3 in and of itself is not going to ever perform well enough. What this tells us is S3’s going to be great for a long-term durable store, but to really scale this, we need to do something faster. That turns into the question of what.
    • Just starting with everyone, let’s just say, everyone who wants to do things fast and is just going to in-memory databases today, what does the math of that look like? 300 terabytes, that’s 80 x1e.32xlarge instances for a month. That takes it to $300,000 for a month. Now you’re getting into a really expensive system for one customer. This is with no indexes or overhead.
    • We use LevelDB for SSD storage where it’s very high-performing. DRAM is for in-memory. We like Cassandra, where we can trust it for horizontal scaling, more mid-performance. We use RocksDB and SQLite for when we want a lot of flexibility in the types of queries that we want to run.
    • The other thing is, particularly when we’re storing stuff in memory, what we found is that there is no real substitute for just picking the right data structures and just storing them in memory. We use a lot of Go code just using the right indexes and the right clever patterns to store anything else. The key takeaway, this is a lot of data, but what we’ve found is to do this well and at scale, it’s a very hybrid approach to traditional systems.
  • Attention. HTTP/3 is a thing. HTTP/3: the past, the present, and the future
    • Chrome, curl, and Cloudflare (and soon Mozilla) are rolling out experimental but functional support for HTTP/3.
    • instead of using TCP as the transport layer for the session, it uses QUIC, a new Internet transport protocol, which, among other things, introduces streams as first-class citizens at the transport layer. QUIC streams share the same QUIC connection, so no additional handshakes and slow starts are required to create new ones, but QUIC streams are delivered independently such that in most cases packet loss affecting one stream doesn’t affect others. This is possible because QUIC packets are encapsulated on top of UDP datagrams.
    • QUIC also combines the typical 3-way TCP handshake with TLS 1.3’s handshake. Combining these steps means that encryption and authentication are provided by default, and also enables faster connection establishment. 
  • Processing 40 TB of code from ~10 million projects with a dedicated server and Go for $100:
    • I have 12 million or so git repositories which I need to download and process.
    • This worked brilliantly. However the problem with the above was firstly the cost, and secondly lambda behind API-Gateway/ALB has a 30 second timeout, so it couldn’t process large repositories fast enough. I knew going in that this was not going to be the most cost effective solution but assuming it came close to $100 I would have been willing to live with it. After processing 1 million repositories I checked and the cost was about $60 and since I didn’t want a $700 AWS bill I decided to rethink my solution. 
    • How does one process 10 million JSON files taking up just over 1 TB of disk space in an S3 bucket?
    • The first thought I had was AWS Athena. But since it’s going to cost something like $2.50 USD per query for that dataset I quickly looked for an alternative.
    • My answer to this was another simple Go program to pull the files down from S3 and store them in a tar file. I could then process that file over and over. The processing itself is done through a very ugly Go program that reads the tar file, so I could re-run my questions without having to trawl S3 over and over (a minimal sketch of this read-from-a-tar pattern follows this list).
    • However after time I chose not to use AWS in the end because of cost. 
    • So were someone to do this from scratch using the same method I eventually went with, it would cost under $100 USD to redo the same calculations.
  • Here’s Facebook’s Networking @Scale 2019 recap: “This year’s conference focused on a theme of reliable networking at scale. Speakers talked about various systems and processes they have developed to improve overall network reliability or to detect network outages quickly. They also shared stories about specific network outages, how they were immediately handled, and some general lessons for improving failure resiliency and availability.” You might like: Failing last and least: Design principles for network availability; BGP++ deployment and outages; What we have learned from bootstrapping 1.1.1.1; Operating Facebook’s SD-WAN network; Safe: How AWS prevents and recovers from operational events.
  • 25 Experts Share Their Tips for Building Scalable Web Applications: Tip #1: Choosing the correct tool with scalability in mind reduces a lot of overhead; Tip #2: Caching comes with a price. Do it only to decrease costs associated with Performance and Scalability; Tip #3: Use Multiple levels of Caching in order to minimize the risk of Cache Miss.
  • Google’s approach requires 38x more bandwidth than a Websocket + delta solution, and delivers latencies that are 25x higher on average. Google – polling like it’s the 90s
    • Google strangely chose HTTP Polling.  Don’t confuse this with HTTP long polling where HTTP requests are held open (stalled) until there is an update from the server. Google is literally dumb polling their servers every 10 seconds on the off-chance there’s an update. This is about as blunt a tool as you can imagine.
    • Google’s HTTP polling is 80x less efficient than a raw Websocket solution.  Over a 5 minute window, the total overhead uncompressed is 68KiB vs 16KiB for long polling and a measly 852 bytes for Websockets
    • The average latency for Google’s long polling solution is roughly 25x slower than any streaming transport that could have been used.
    • Every request, every 10 seconds, sends the entire state object…Over a 5 minute window, once the initial state is set up, Google’s polling solution consumes 282KiB of data from the Google servers, whereas using Xdelta (encoded with base-64) over a Websocket transport, only 426 bytes is needed. That represents 677x less bandwidth needed over a 5 minute window, and 30x less bandwidth when including the initial state set up.
  • The 8base tech stack:  we chose Amazon Web Services (AWS) as our computing infrastructure; serverless computing using AWS Lambda; AWS Aurora MySQL and MongoDB Atlas as databases; AWS S3 (Simple Storage Service) for object storage service; AWS’s API Gateway; 8base built an incredibly powerful GraphQL API engine; React; Auth0.
  • The rise of the Ctaic family of programming languages. Altaic: Rise and Fall of a Linguistic Hypothesis. Interesting parallels with the lineage of programming languages. Spoken languages living side by side for long periods of times come to share vocabulary and even grammar, yet they are not part of the same family tree. Words transfer sidewise between unrelated languages rather than having a parent child relationship. I don’t even know if there is formal programming language lineage chart, but it does seem languages tend to converge over time as user populations agitate for the adoption of language features from other languages into their favorite language. Even after years of principled objection to generics being added to Go, many users relentlessly advocate for Go generics. And though C++ is clearly in parent child relationship with C, over the years C++ has adopted nearly every paradigm under the sun.
  • Orchestrating Robot Swarms with Java
    • How does a busy-loop go? Here’s that Real-timeEventScheduler that I didn’t show you before, this time with a busy-loop implemented in it. Similar to our discrete-event scheduler, we have our TimeProvider, this time it’s probably the system TimeProvider. I’ve got the interface here and we have our queue of events. Rather than iterating or looping on our queue while we have tasks, we loop around forever and we check, is this event due to be scheduled now, or basically has the current time gone beyond the time my event is due to be scheduled? If it is, then we run the event. Otherwise, we loop around. What this is basically doing is going, “Do I have any work? If I have some work, execute it. If not, loop back around. Do I have any work?” till it finds some work and it does it. Why did we do this? What sort of benefits does this get us? Some of the advantages that we saw is that our latency for individual events went down from basically around 5 milliseconds to effectively 0, because you’re not waiting for anything to wake up, you’re not waiting for a thread to get created, you’re just there constantly polling, and as soon as you’ve got your event, you can execute it. We saw in our system that throughput of events went up by three times, which is quite good for us. (A minimal sketch of this busy-loop pattern follows this list.)
    • We have some parts of our computation which can be precomputed in advance. At application startup time, we can take all these common parts of calculations and eagerly calculate them and cache the results. What that means is, when we come to communicating with our robots, we don’t have to do full computation in our algorithms, we only have to do the smallest amount of computation based on what the robot is telling us.
    • To reduce garbage collection overhead: remove Optional from APIs that are heavily used by us, use for-loops instead of the Streams API, use an array-backed data structure instead of something like HashSet or LinkedList, and avoid primitive boxing, especially in places like log lines. The thing that these all have in common is basically excess object creation.
    • ZGC is new in Java 11, labeled as experimental, but it’s promising some seriously low pause times, on the order of 10 milliseconds, on heaps of over 100 gigabytes. By just switching to ZGC, that 50 milliseconds is all the way over here, beyond the 99th percentile. That means less than 1 in 100 pauses are greater than 50 milliseconds and for us, that’s amazing.
  • It would be hard to find a better example of the risks of golden-path testing and development. New Cars’ Pedestrian-Safety Features Fail in Deadliest Situations. Somewhere in the test matrix should be regression tests for detecting pedestrians at night. We really need a standardized test for every autonomous car software update. It might even be a virtual course. The key is cars are not phones or DVRs. A push-based over-the-air update process means a once safe vehicle is one unstable point release away from causing chaos on the road—or the death of a loved one. What if iOS 13 was the software that ran your car? Not a pretty thought.
  • CockroachDB offers the lowest price for OLTP workloads, and it does so while offering the highest level of consistency. Just How “Global” Is Amazon Aurora? CockroachDB would like you to know Aurora is not all that. Why? 1) It’s optimized for read-heavy workloads when write scalability can be limited to a single master node in a single region. 2) Replication between regions is asynchronous, there is the potential for data loss of up to a second (i.e., a non-zero recovery point objective – RPO), and up to a minute to upgrade the read node to primary write node (i.e., 1 minute for the recovery time objective – RTO). 3) Multi-master – there is no option to scale reads to an additional region. 4) Multi-master – doubling of the maximum write throughput is gained at the expense of significantly decreased maximum read throughput. 5) Aurora multi-master does not allow for SERIALIZABLE isolation. 6) Depends on a single write node for all global writes. 7) Performs well in a single region and is durable to survive failures of an availability zone. However, there can be latency issues with writes because of the dependence on a single write instance. The distance between the client and the write node will define the write latency as all writes are performed by this node. 8) Does not have the ability to anchor data execution close to data.
  • Scaling the Hotstar Platform for 50M
    • One of our key insights from 2018 was that auto-scaling would not work, which meant that we had static “ladders” that we stepped up/down to, based on the amount of “headroom” left.
    • Our team took an audacious bet to run 2019 on Kubernetes (K8s). It became possible to think of building our own auto-scaling engine that took into account multiple variables that mattered to our system. 
    • We supported 2x more concurrency in 2019 with 10x less compute overall. This was a 6–8 month journey that had its roots in 2–3 months of ideation before we undertook it. This section might make it sound easy; it isn’t.
    • Your system is unique, and it will require a unique solution.
    • We found a system taking up more than 70% compute for a feature that we weren’t even using.
  • We hear a lot about bad crypto, but what does good crypto even look like? Who talks about that? Steve Gibson, that’s who. In The Joy of Sync, Steve describes how Sync.com works and declares it good: Zero-knowledge, end-to-end encryption ● File and file meta data is encrypted client-side and remains encrypted in transit and at rest. ● Web panel, file sharing and share collaboration features are also zero-knowledge. ● Private encryption keys are only accessible by the user, never by Sync. ● Passwords are never transmitted or stored, and are only ever known by the user….A randomly generated 2048 bit RSA private encryption key serves as the basis for all encryption at Sync. During account creation, a unique private key is generated and encrypted with 256 bit AES GCM, locked with the user’s password. This takes place client-side, within the web browser or app. PBKDF2 key stretching with a high iteration count is used to help make weak passwords more cryptographically secure. Encrypted private keys are stored on Sync’s servers, and downloaded and decrypted locally by the desktop app, web panel or mobile apps after successful authentication. At no time does Sync have access to a user’s private key. And there’s much more about how good crypto works. (A minimal sketch of this client-side key-wrapping idea follows this list.)
  • Lyft on Operating Apache Kafka Clusters 24/7 Without A Global Ops Team. Kind of an old school example of how to run your own service and make it reliable for you. Also an example of why the cloud isn’t magic sauce. Things fail and don’t fix themselves. That’s why people pay extra for managed services. But Lyft built the monitoring and repair software themselves and now Kafka runs without a huge operational burden.
  • This could help things actually become the internet of things. Photovoltaic-powered sensors for the “internet of things”: Perovskite [solar] cells, on the other hand, can be printed using easy roll-to-roll manufacturing techniques for a few cents each; made thin, flexible, and transparent; and tuned to harvest energy from any kind of indoor and outdoor lighting. The idea, then, was combining a low-cost power source with low-cost RFID tags, which are battery-free stickers used to monitor billions of products worldwide. The stickers are equipped with tiny, ultra-high-frequency antennas that each cost around three to five cents to make…enough to power up a circuit — about 1.5 volts — and send data around 5 meters every few seconds.
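
From the WhatsApp item above: a minimal Go sketch (not WhatsApp's Erlang code) of the two lookup paths described there, i.e. hashing a phone number onto a fixed set of chat nodes, plus a session-manager map that records which server a connected phone is currently on. The node names and record shapes are assumptions for illustration.

```go
// Hash-based placement plus a session-manager lookup, as described in the
// WhatsApp item: phone numbers hash to chat nodes; a session manager records
// where each connected phone actually is.
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

var chatNodes = []string{"chat-1", "chat-2", "chat-3", "chat-4"} // hypothetical node list

// nodeForPhone hashes a phone number onto one of the chat nodes.
func nodeForPhone(phone string) string {
	h := fnv.New32a()
	h.Write([]byte(phone))
	return chatNodes[h.Sum32()%uint32(len(chatNodes))]
}

// A toy session manager: a phone registers itself when it connects, and other
// servers ask it which server a given phone is connected to.
type sessionManager struct {
	mu       sync.RWMutex
	sessions map[string]string // phone -> server
}

func (s *sessionManager) register(phone, server string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sessions[phone] = server
}

func (s *sessionManager) lookup(phone string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	srv, ok := s.sessions[phone]
	return srv, ok
}

func main() {
	sm := &sessionManager{sessions: map[string]string{}}
	phone := "+15551234567"
	server := nodeForPhone(phone)
	sm.register(phone, server)
	if srv, ok := sm.lookup(phone); ok {
		fmt.Println(phone, "is connected to", srv)
	}
}
```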
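
From the "40 TB of code for $100" item above: a minimal Go sketch of the "download once into a tar, then re-run questions locally" pattern, i.e. streaming a tar of JSON files and decoding each entry. The file name and record shape are assumptions for illustration, not the author's actual program.

```go
// Stream a tar of JSON results and aggregate them, so questions can be re-run
// repeatedly without trawling S3 again.
package main

import (
	"archive/tar"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"os"
)

// Hypothetical per-repository result record.
type repoResult struct {
	Name  string `json:"name"`
	Lines int64  `json:"lines"`
}

func main() {
	f, err := os.Open("repos.tar") // hypothetical file produced by the S3 download step
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	var total int64
	tr := tar.NewReader(f)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break // end of archive
		}
		if err != nil {
			log.Fatal(err)
		}
		if hdr.Typeflag != tar.TypeReg {
			continue // skip directories and other non-file entries
		}
		var r repoResult
		if err := json.NewDecoder(tr).Decode(&r); err != nil {
			log.Printf("skipping %s: %v", hdr.Name, err)
			continue
		}
		total += r.Lines
	}
	fmt.Println("total lines across all repositories:", total)
}
```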
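
From the robot-swarm item above: the talk's scheduler is Java, but here is a minimal Go sketch of the same busy-loop idea, i.e. keep a time-ordered queue of events and spin, running each event as soon as its due time has passed instead of blocking on a timer. The trade-off is a fully occupied core in exchange for lower wake-up latency, which matches the roughly-5-ms-to-effectively-0 improvement described in the talk.

```go
// Busy-loop event scheduler sketch: never sleep on a timer, just keep asking
// "is the earliest event due yet?" and run it the instant it is.
package main

import (
	"container/heap"
	"fmt"
	"time"
)

type event struct {
	at  time.Time
	run func()
}

// eventQueue is a min-heap ordered by each event's scheduled time.
type eventQueue []event

func (q eventQueue) Len() int            { return len(q) }
func (q eventQueue) Less(i, j int) bool  { return q[i].at.Before(q[j].at) }
func (q eventQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *eventQueue) Push(x interface{}) { *q = append(*q, x.(event)) }
func (q *eventQueue) Pop() interface{} {
	old := *q
	e := old[len(old)-1]
	*q = old[:len(old)-1]
	return e
}

func main() {
	q := &eventQueue{}
	heap.Init(q)
	heap.Push(q, event{at: time.Now().Add(50 * time.Millisecond), run: func() { fmt.Println("move robot") }})
	heap.Push(q, event{at: time.Now().Add(10 * time.Millisecond), run: func() { fmt.Println("read sensor") }})

	// The busy-loop: check whether the earliest event is due; if not, spin.
	for q.Len() > 0 {
		if (*q)[0].at.After(time.Now()) {
			continue // not due yet; keep polling
		}
		e := heap.Pop(q).(event)
		e.run()
	}
}
```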
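
From the Sync.com item above: a minimal Go sketch (assumptions only, not Sync.com's implementation) of the client-side key-wrapping idea it describes, i.e. generate a 2048-bit RSA private key, derive an AES key from the user's password with PBKDF2, and encrypt the private key with AES-256-GCM so only the wrapped key ever leaves the client. It uses the golang.org/x/crypto/pbkdf2 package; the iteration count and password are placeholders.

```go
// Client-side key wrapping: the server only ever sees salt, nonce, and the
// AES-GCM-encrypted private key, never the password or the plaintext key.
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"crypto/x509"
	"fmt"

	"golang.org/x/crypto/pbkdf2"
)

func main() {
	// 2048-bit RSA private key that serves as the basis for encryption.
	priv, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		panic(err)
	}
	privDER := x509.MarshalPKCS1PrivateKey(priv)

	// Derive a 256-bit AES key from the user's password (high iteration count).
	salt := make([]byte, 16)
	rand.Read(salt)
	key := pbkdf2.Key([]byte("users-password"), salt, 200_000, 32, sha256.New)

	// Wrap (encrypt) the private key with AES-256-GCM, entirely client-side.
	block, _ := aes.NewCipher(key)
	gcm, _ := cipher.NewGCM(block)
	nonce := make([]byte, gcm.NonceSize())
	rand.Read(nonce)
	wrapped := gcm.Seal(nil, nonce, privDER, nil)

	fmt.Printf("wrapped private key: %d bytes (salt %d, nonce %d)\n",
		len(wrapped), len(salt), len(nonce))
}
```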

Soft Stuff:

  • Programming with Bigmachine is as if you have a single process with “cloudroutines” across a large cluster of physical nodes. Bigmachine (article): an attempt to reclaim programmability in cloud computing. Bigmachine is a Go library that lets the user construct a system by writing ordinary Go code in a single, self-contained, monolithic program. This program then manages the necessary compute resources, and distributes itself across them. No infrastructure is required besides credentials to your cloud provider. Bigmachine achieves this by defining a unified programming model: a Bigmachine binary distributes callable services onto abstract “machines”. The Bigmachine library manages how these machines are provisioned, and transparently provides a mutually authenticated RPC mechanism. (A rough, generic sketch of the “callable services on machines” shape follows this list.)
    • @marius: When we built Reflow, we came upon an interesting way to do cluster computing: self-managing processes. The idea was that, instead of using complicated cluster management infrastructure, we could build a vertically integrated compute stack. Because of Reflow’s simplified needs, cluster management could also be a lot simpler, so we built Reflow directly on top of a much lower level interface that could be implemented by EC2 (or really any VM provider) directly. Bigmachine is this idea reified in a Go package. It defines a programming model around the idea of an abstract “machine” that exposes a set of services. The Bigmachine runtime manages machine creation, bootstrapping, and secure RPC. Bigmachine supports any comms topology. Bigmachine also goes to great lengths to provide transparency. For example, standard I/O is sent back to the user; Go’s profile tooling “just works” and gives you profiles that are merged across the whole cluster; stats are automatically aggregated.
    • @everettpberr: It’s hard to believe a gap like this exists in cloud computing today but it’s absolutely there. We deal with this every week. If I have a local program that now I want to execute across a lot of data – there’s _still_ a lot of hassle involved. BigSlice may solve this.
  • Bigslice: a cluster computing system in the style of Spark. Bigslice is a Go library with which users can express high-level transformations of data. These operate on partitioned input, which lets the runtime transparently distribute fine-grained operations and perform data shuffling across operation boundaries. We use Bigslice in many of our large-scale data processing and machine learning workloads.
    • @marius: Bigslice is a distributed data processing system built on top of Bigmachine. It’s similar to Spark and FlumeJava, but: (1) it’s built for Go; (2) it fully embodies the idea of self-managing serverless computing. We’re using Bigslice for many of our large scale workloads at GRAIL. Because Bigslice is built on top of Bigmachine, it is also fully “self-managing”: the user writes their code, compiles a binary, and runs it. The binary has the capability of transparently distributing itself across a large ad hoc cluster managed by the same runtime. This model of cluster computing has turned out to be very pleasant in practice. It’s easy to make modifications across the stack, and from an operator’s perspective, all you need to do is bring along some cloud credentials. Simplicity and transparency in cloud computing.
  • cloudflare/quiche: an implementation of the QUIC transport protocol and HTTP/3 as specified by the IETF. It provides a low level API for processing QUIC packets and handling connection state.
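
From the Bigmachine item above: a rough sketch, using only Go's standard net/rpc package, of the "callable services on abstract machines" shape. This is not the Bigmachine API; it only illustrates the model of registering a service on a remote "machine" and invoking it from the driver over RPC. The Squarer service and addresses are made up for illustration.

```go
// One process plays both roles here for brevity: the "machine" side registers
// a service and serves RPCs; the "driver" side dials it and calls the service.
package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
)

// A service we want to expose on a "machine".
type Squarer struct{}

func (Squarer) Square(x int, reply *int) error {
	*reply = x * x
	return nil
}

func main() {
	// "Machine" side: register the service and accept RPC connections.
	if err := rpc.Register(Squarer{}); err != nil {
		log.Fatal(err)
	}
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}
	go rpc.Accept(l)

	// "Driver" side: dial the machine and call the remote service.
	client, err := rpc.Dial("tcp", l.Addr().String())
	if err != nil {
		log.Fatal(err)
	}
	var out int
	if err := client.Call("Squarer.Square", 7, &out); err != nil {
		log.Fatal(err)
	}
	fmt.Println("7 squared on the remote machine:", out)
}
```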

Pub Stuff:  

  • Numbers limit how accurately digital computers model chaos: Our work shows that the behaviour of the chaotic dynamical systems is richer than any digital computer can capture. Chaos is more commonplace than many people may realise and even for very simple chaotic systems, numbers used by digital computers can lead to errors that are not obvious but can have a big impact. Ultimately, computers can’t simulate everything.
  • The Effects of Mixing Machine Learning and Human Judgment: Considered in tandem, these findings indicate that collaboration between humans and machines does not necessarily lead to better outcomes, and human supervision does not sufficiently address problems when algorithms err or demonstrate concerning biases. If machines are to improve outcomes in the criminal justice system and beyond, future research must further investigate their practical role: an input to human decision makers.

from High Scalability

Redis Cloud Gets Easier with Fully Managed Hosting on Azure

ScaleGrid, a rapidly growing leader in the Database-as-a-Service (DBaaS) space, has just launched their new fully managed Redis on Azure service. This Redis management solution allows startups up to enterprise-level organizations to automate their Redis operations on Microsoft Azure dedicated cloud servers, alongside their other open source database deployments, including MongoDB, MySQL and PostgreSQL.

Redis, the #1 key-value store and top 10 database in the world, has grown by over 300% in popularity over the past 5 years, per the DB-Engines knowledge base. The demand for Redis is skyrocketing across dozens of use cases, particularly for cache, queues, geospatial data, and high speed transactions. This simple database management system makes it very easy to store and retrieve pairs of keys and values, and is commonly paired with other database types to increase the speed and performance of an application. According to the 2019 Open Source Database Report, a majority of Redis deployments are used in conjunction with MySQL, and over half of Redis deployments are used with PostgreSQL, MongoDB, or Elasticsearch.

ScaleGrid’s Redis hosting service allows these organizations to automate all of their time-consuming management tasks, such as backups, upgrades, scaling, replication, sharding, monitoring, alerts, log rotations, and OS patching, so their DBAs, developers, and DevOps teams can focus on new product development and optimizing performance. Additionally, organizations can customize their Redis persistence and host through their own Azure account which allows them to leverage advanced cloud capabilities like Azure Virtual Networks (VNET), Security Groups, and Reserved Instances to reduce long-term hosting costs up to 60%. 

“Cloud reliability has never been so important,” says Dharshan Rangegowda, Founder and CEO of ScaleGrid. “It’s crucial for organizations to properly configure their Redis deployments for high availability and disaster recovery, as a couple minutes of downtime can be detrimental to a company’s security and reputation.”

ScaleGrid is the only Redis cloud service that allows you to customize your master-slave and cross-datacenter configurations for 100% uptime and availability across 30 different Azure regions. They also allow you to keep full Redis admin access and SSH access to your machines, and you can learn more about their advantages over competitors Compose for Redis, RedisGreen, Redis Labs and Elasticache for Redis on their Compare Redis Providers page.

from High Scalability

Sponsored Post: Sisu, Educative, PA File Sight, Etleap, PerfOps, InMemory.Net, Triplebyte, Stream, Scalyr

Who’s Hiring? 

  • Sisu Data is looking for machine learning engineers who are eager to deliver their features end-to-end, from Jupyter notebook to production, and provide actionable insights to businesses based on their first-party, streaming, and structured relational data. Apply here.
  • Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Make your job search O(1), not O(n). Apply here.
  • Need excellent people? Advertise your job here! 

Cool Products and Services

  • Grokking the System Design Interview is a popular course on Educative.io (taken by 20,000+ people) that’s widely considered the best System Design interview resource on the Internet. It goes deep into real-world examples, offering detailed explanations and useful pointers on how to improve your approach. There’s also a no questions asked 30-day return policy. Try a free preview today.
  • PA File Sight – Actively protect servers from ransomware, audit file access to see who is deleting files, reading files or moving files, and detect file copy activity from the server. Historical audit reports and real-time alerts are built-in. Try the 30-day free trial!
  • For heads of IT/Engineering responsible for building an analytics infrastructure, Etleap is an ETL solution for creating perfect data pipelines from day one. Unlike older enterprise solutions, Etleap doesn’t require extensive engineering work to set up, maintain, and scale. It automates most ETL setup and maintenance work, and simplifies the rest into 10-minute tasks that analysts can own. Read stories from customers like Okta and PagerDuty, or try Etleap yourself.
  • PerfOps is a data platform that digests real-time performance data for CDN and DNS providers as measured by real users worldwide. Leverage this data across your monitoring efforts and integrate with PerfOps’ other tools such as Alerts, Health Monitors and FlexBalancer – a smart approach to load balancing. FlexBalancer makes it easy to manage traffic between multiple CDN providers, API’s, Databases or any custom endpoint helping you achieve better performance, ensure the availability of services and reduce vendor costs. Creating an account is Free and provides access to the full PerfOps platform.
  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net
  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it’s easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we’d like to add a few zeros to that number. Check out the job opening on AngelList.
  • Scalyr is a lightning-fast log management and operational data platform. It’s a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services — all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.
  • Advertise your product or service here!

Fun and Informative Events

  • Advertise your event here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.


PA File Sight monitors file access on a server in real-time.

It can track who is accessing what, and with that information can help detect file copying, detect (and stop) ransomware attacks in real-time, and record the file activity for auditing purposes. The collected audit records include user account, target file, the user’s IP address and more. This solution does NOT require Windows Native Auditing, which means there is no performance impact on the server. Join thousands of other satisfied customers by trying PA File Sight for yourself. No sign up is needed for the 30-day fully functional trial.


Make Your Job Search O(1) — not O(n)

Triplebyte is unique because they’re a team of engineers running their own centralized technical assessment. Companies like Apple, Dropbox, Mixpanel, and Instacart now let Triplebyte-recommended engineers skip their own screening steps.

We found that High Scalability readers are about 80% more likely to be in the top bracket of engineering skill.

Take Triplebyte’s multiple-choice quiz (system design and coding questions) to see if they can help you scale your career faster.


The Solution to Your Operational Diagnostics Woes

Scalyr gives you instant visibility of your production systems, helping you turn chaotic logs and system metrics into actionable data at interactive speeds. Don’t be limited by the slow and narrow capabilities of traditional log monitoring tools. View and analyze all your logs and system metrics from multiple sources in one place. Get enterprise-grade functionality with sane pricing and insane performance. Learn more today


If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

from High Scalability

Stuff The Internet Says On Scalability For September 27th, 2019

Wake up! It’s HighScalability time:

Nifty diagram of what testing looks like in an era of progressive delivery. (@alexsotob, @samnewman)

Do you like this sort of Stuff? I’d greatly appreciate your support on Patreon. I wrote Explain the Cloud Like I’m 10 for all who want to understand the cloud. On Amazon it has 55 mostly 5 star reviews (131 on Goodreads). They’ll thank you for changing their life forever.

Number Stuff:

  • 2: percentage of human DNA coding for genes, so all the extra code in your project is perfectly natural. And 99.9% of your DNA is like any other person’s, so all that duplicate code in your project is also perfectly natural.
  • 40%: do some form of disaster testing annually in production. 
  • 1 billion: Windows 10 devices in 2020. 
  • ~1.5: Bit-Swap better compression than GNU Gzip. 
  • 1 billion: Slack messages sent weekly.
  • 1.4%: decline in electronic equipment sales compared to the same quarter of last year. 
  • 50%: increase year-over-year in enterprise adoption and deployments of multi-cloud. 80%+ of customers on all three clouds use Kubernetes. 1 in 3 enterprises are using serverless in production. AWS Lambda adoption grew to 36% in 2019, up 12% from 2017.
  • $100 million: fund to empower individual creators, galvanize open-standard monetization service providers, and allow users to directly support content they value.
  • #1: most dangerous software error is: Improper Restriction of Operations within the Bounds of a Memory Buffer.
  • $0.030 – $0.035: Backblaze’s target per gigabyte of storage cost.
  • $16.5 billion: record investment in robot sector along with a staggering jump in the number of collaborative robot installations last year.
  • 100,000: free AI generated headshots.
  • 37%: Wi-Fi 6 single-user data rate is faster than 802.11ac.

Quotable Stuff:

  • @cassidoo: My 55-year-old father-in-law has been trying to get a junior coding job for over 2 years and seeing him constantly get rejected for younger candidates at the last interview round is ageism in tech at its finest 😞
  • Robert Wolkow: We’ve built an atomic scale device that’s as disruptive to the transistor as the transistor was to the vacuum tube. It will change everything. (paper)
  • @gitlab: We learned that @NASA will be flying Kubernetes clusters to the moon  🚀
  • Yunong Shi: The first classical bits used on the giant ENIAC machine are a room of vacuum tubes (around 17000 in total). On average, there was only one tube that fails every two days. On the other hand, for the first generation of qubits we have now, the average lifetime is on the scale of a millisecond to second…after about one hundred to one thousand operations, all qubits are expected to fail
  • @wattersjames: “The latest SQL benchmarks on AWS demonstrated that YSQL is 10x more scalable than the maximum throughput possible with Amazon Aurora.”
  • outworlder: AWS support is stellar. We have workloads on the three major cloud providers, and AWS support is better by an order of magnitude. If anything, this has spoiled us. When things that take minutes to solve (or figure out) on AWS takes days on another cloud provider, no amount of technical wizardry can make up for it. They won’t be dragging your feet just because you have a lower level plan, but if you want to call them to solve stuff right now on the phone or have them showing up at your company with specialists in tow, then you have to fork over the required amount. It’s well spent, IMHO.
  • Andrei Alexandrescu: Speed Is Found In The Minds of People
  • throwsurveill: It took basically until now for the bloom to come off the rose. Think about that. For 20 years Google has been supposedly hiring the smartest guys in the room and all it took was free food, some ball pits and slides, contributing some tech to open source, and working on a handful of “moonshots” that haven’t gone anywhere to keep the sheen of innovation going. And it worked. For 20 years. People have been saying Google is the new Microsoft for a few years but it basically took until now for that to become consensus. Microsoft, who’s been on the back foot until recently, has recast themselves as the new Open Source Champion, basically using the Google playbook from 20 years ago. And it’s working!
  • Jerry Neumann: Moats draw their power to prevent imitation from one of four basic sources: The state, Special know-how, Scale, or System rigidity.
  • QuestionsHurt: I’ve used most hosting setups in my time, shared hosting, dedicated servers, VPS, PaaS like Heroku, EC2 et al., Serverless, and JAMStack like Netlify. Plus other things that sound new but aren’t. I keep coming back to VPS like Digital Ocean, Vultr and the likes. You get more control of the server and more control of your bill. Which is vital to newborn projects.
  • Slack: It’s a race to scale shared channels to fit the needs of our largest customers, some of which have upward of 160,000 active users and more than 5,000 shared channels.
  • @colmmacc: I know that Internet is a great success and all but goddamnit UDP is such a piece of garbage. Even in the 70s, the designers should have had the sense to do fragmentation at layer 4, not layer 3, and put a UDP header in every packet. Don’t get me started on DNS.
  • @elkmovie: N.B.: the highest Geekbench 5 single-core score for *any* Mac is 1262. (2019 iMac 3.6) So the iPhone 11 now offers the fastest single-core performance of any computer Apple has ever made.
  • @anuraggoel: Serverless has its place, but for the love of everything that is holy, please don’t move your whole stack to serverless just because an AWS consultant told you to.
  • @edjgeek: Best practice is not to couple lambdas in this pattern. For resiliency we recommend SNS/SQS/EventBridge for pub/sub and queueing in serverless. When locally testing, an event from any of these can be mocked for testing via ‘sam local generate-event’; use --help if needed
  • @it4sec: If you plan to fuzz CAN Bus in real vehicle, please make sure that Airbags are disabled.
  • 250bpm: It is said that every year the IQ needed to destroy the world drops by one point. Well, yes, but let me add a different spin on the problem: Every year, the IQ needed to make sense of the world raises by one point. If your IQ is 100 and you want to see yourself in 2039 just ask somebody with IQ 80 and listen carefully.
  • Haowei Yuan: a Dropbox user request goes through before it reaches backend services where application logics are executed. The Global Server Load Balancer (GSLB) distributes a user request to one of our 20+ Point of Presences (PoPs) via DNS. Within each PoP, TCP/IP (layer-4) load balancing determines which layer-7 load balancer (i.e., edge proxies) is used to early-terminate and forward this request to data centers. Inside a data center, Bandaid is a layer-7 load balancing gateway that routes this request to a suitable service…The core concept of our work is to leverage real-time information to make better load balancing decisions. We chose to piggyback the server load information in HTTP responses because it was simple to implement and worked well with our setup. 
  • Marc Greenberg: There needs to be a fundamental shift in what a computer looks like for compute in memory to really take off. There are companies using analog properties of memory cells to do interesting things. Those technologies are still very early in their development, but they’re really interesting. 
  • @QuinnyPig: There’s at least a 60% chance that I could start talking about a fictitious @awscloud MoonBase and I’d be suspected of breaking an NDA somewhere.
  • Jeff Klaus: Globally we are still seeing increasing data center growth. As noted in a recent report, “the seven primary U.S. data center markets saw 171 megawatts (MW) of net absorption in H1 2019, nearly 57 percent of 2018’s full-year record. That absorption nearly eclipsed the 200 MW of capacity added in H1. Northern Virginia, the largest data center market in the world, accounted for 74 percent of net absorption in the primary markets.”
  • Burke Holland: So is the cost of Serverless over-hyped? No. It’s for real. Until you reach a sizeable scale, you’ll pay very little if anything at all. Serverless is one of the most remarkable technologies to come your way in quite some time. Couple that with the automatic infinite scaling and the fact that you don’t even have to deal with a runtime anymore, and this one is a no-brainer.
  • Graham Allan: And 3D stacking for DDR4 and eventually DDR5, as well. And then increased capacity beyond that, you’re taking it on the DIMM and adding all the RC buffers and the data buffers. Registered DIMMS of 3D stacked devices is probably where you’re going to see the sweet spot for DDR5 for very very high capacity requirements. You can get 128 to 256 gigabytes, and maybe 512 gigabytes in the not-to-distant future, in one DIMM card. And that’s just DRAM.
  • Andy Heinig: the ever-increasing expansion of autonomous driving will also place significantly higher demands on integration technology. The data transfer rate between the circuits will be very high because extensive image and radar data is processed, leading to large data quantities per unit of time. Then this data must be processed in the circuits of the chiplets, and therefore regularly exchanged between the circuits. This requires high data rates that can only be realized with fast and massively parallel interfaces, so the corresponding package technology also has to be prepared. Only approaches such as 2.5D integration (interposers) or fan-out technologies can satisfy these requirements.
  • Kurt Shuler: It was also clear from the conference [HotChips] that AI is driving some huge chips, the largest being Cerebras’ 1.2 Trillion transistor 46,225 square mm wafer-sized chip, and that interconnect topologies and more distributed approaches to caching are becoming fundamental to making these designs work with acceptable throughput and power. 
  • JoeAltmaier: Ah, my sister endured all this sort of thing during 20 years as a VP in corporate America. She successfully deployed new data systems to 120 plants in 50 regions in one year. Didn’t cost $25M. Her method? Ruthlessly purge the region of the old data system and install the new (web-based API to a central, new data system). Investigate every regional difference and consolidate into one model. Before deployment day, get all the regional Directors in one room and tell them it was going to happen. Tell them there was no going back, no push-back would be permitted, and have the CEO in the room to confirm this.
  • NoraCodes: The elephant in the room (post?) is that the reason all these open chat protocols are failing is because of deliberate and serious damage done by attack from corporate software companies, especially Facebook and Google. Back in the day, I used XMPP to chat with people from all over the Internet, and so did a lot of my friends, precisely because it was easy to connect with people outside whatever walled garden you used primarily from a single desktop client software. Google and Facebook deliberately killed that model. That’s on them. Same thing with Slack, which had IRC and XMPP gateways for a long time.
  • Maureen Tkacik: Nearly two decades before Boeing’s MCAS system crashed two of the plane-maker’s brand-new 737 MAX jets, Stan Sorscher knew his company’s increasingly toxic mode of operating would create a disaster of some kind. A long and proud “safety culture” was rapidly being replaced, he argued, with “a culture of financial bullshit, a culture of groupthink.”
  • PaulAJ: “Core competencies” is a widely misunderstood term. Lots of people equate it to “business model”, as in “we sell widgets so therefore selling widgets is our core competence”. A thing is a core competence if, and only if: * It makes a difference to your customers. * It is difficult for your competitors to replicate. * It provides access to a wide range of markets.
  • @mweagle: “Guarantees do not necessarily compose into systems.” On Eliminating Error in Distributed Software Systems
  • Erik Brynjolfsson: Artificial intelligence (AI) is advancing rapidly, but productivity growth has been falling for a decade, and real income has stagnated. The most plausible explanation is that it will take considerable time for AI-related technologies to be deployed throughout the economy.
  • Lauren Smiley: The criminal oversights didn’t end there. As Karen’s body was unzipped from the body bag and laid out at the morgue, the coroner took note of a black band still encircling her left wrist: a Fitbit Alta HR—a smartwatch that tracks heartbeat and movement. A judge signed a warrant to extract its data, which seemed to tell the story Karen couldn’t: On Saturday, September 8, five days before she was found, Karen’s heart rate had spiked and then plummeted. By 3:28 in the afternoon, the Fitbit wasn’t registering a heartbeat.
  • DSHR: When HAMR and MAMR finally ship in volume they will initially be around 20% lower $/GB. L2 Drive promises a cost decrease twice as big as the cost decrease the industry has been struggling to deliver for a decade. What is more, their technology is orthogonal to HAMR and MAMR; drives could use both vacuum and HAMR or MAMR in the 2022-3 timeframe, leading to drives with capacities in the 25-28TB range and $/GB perhaps half the current value.
  • Duje Tadin: We often neglect how we get rid of the things that are less important. And oftentimes, I think that’s a more efficient way of dealing with information. If you’re in a noisy room, you can try raising your voice to be heard — or you can try to eliminate the source of the noise.
  • Desire Athow: With an uncompressed capacity of 9TB, it translates into a per TB cost of $6.55, about 12x less than the cheapest SSD on the market and 1/4 the price of the 12TB Seagate Exos X14, currently the most affordable hard disk drive on the market on a per TB basis. In other words, if you want a LOT of capacity, then tape is the obvious answer
  • Michael Graziano: Attention is the main way the brain seizes on information and processes it deeply. To control its roving attention, the brain needs a model, which I call the attention schema. Our attention schema theory explains why people think there is a hard problem of consciousness at all. Efficiency requires the quickest and dirtiest model possible, so the attention schema leaves aside all the little details of signals and neurons and synapses. Instead, the brain describes a simplified version of itself, then reports this as a ghostly, non-physical essence, a magical ability to mentally possess items. Introspection – or cognition accessing internal information – can never return any other answer. It is like a machine stuck in a logic loop. The attention schema is like a self-reflecting mirror: it is the brain’s representation of how the brain represents things, and is a specific example of higher-order thought. In this account, consciousness isn’t so much an illusion as a self-caricature.

Useful Stuff:

  • The equivalent of razor blades in SaaS is paying double for all the services you need to “support” the loss leader service. That’s wrong. 
    • @aripalo: Someone once asked how AWS makes money with #serverless as you don’t pay for idle. I’m glad that someone asked, I can tell you: CloudWatch. One account CW cost 55% because putMetricData. I’ll have to channel my inner @QuinnyPig, start combing through bills & figure out options.
    • @magheru_san: You do pay for idle but in a different way. If your Lambda function responds in single digit milliseconds, you are getting charged for 100ms or >10x than what your function actually consumed. Including if the function is sleeping or waiting for network traffic with an idle CPU
    • @QuinnyPig: …Then CloudWatch gets you again on GetMetric calls when Datadog pulls data in. Then you pay Datadog.
    • @sysproc: Don’t forget the price you pay per month for each unique metric namespace that you then pay more to populate via those PutMetricData calls.
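    • To make @magheru_san’s point above concrete, here is a rough back-of-the-envelope sketch in Python (illustrative numbers only; it assumes the 100ms billing increment described in the tweet):

      import math

      def billed_ms(actual_ms, increment_ms=100):
          # Duration is rounded up to the next billing increment.
          return math.ceil(actual_ms / increment_ms) * increment_ms

      actual = 8                      # a function that responds in single-digit milliseconds
      billed = billed_ms(actual)
      print(f"actual {actual} ms, billed {billed} ms, {billed / actual:.1f}x overhead")
      # -> actual 8 ms, billed 100 ms, 12.5x overhead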
  • Videos from CloudNative London 2019 are now available.
  • Fun graph of Moore’s Law vs. actual transistor count along with a lively discussion thread. On the 2002 spike, michaelmalak: Not quite. It was the Itanium flop. The x86 instruction set was ugly. Everyone — Intel, Microsoft, software developers — wanted a clean slate. So Intel kind of put x86 development on the back burner and started working on Itanium. Itanium is what you see explode in the graph in 2002, leapfrogging the dreadfully slow Pentium 4 (although it had high transistor count and high clock rate, it was just bad). Despite Microsoft making a version of Windows for Itanium, Itanium was a commercial flop due to lack of x86 backward compatibility (outside of slow emulation).
  • Behold the power of the in-memory cache. Splash the cache: how caching improved our reliability. The problem: a spike in webhook requests caused a backup due to slow DynamoDB lookups. Doubling the provisioned capacity was an insanely expensive temporary workaround and didn’t really solve the problem. Switching to auto provisioning was 7x more expensive. The solution: an in-memory 3 second cache in the publisher. How much difference could 3 seconds make? A lot. They went from 300 reads per second to 1.4 reads per second—200x fewer database reads. And 3 seconds is short enough that when a webhook URL is updated they won’t be inconsistent for long. Why use an in-memory cache rather than an external cache like Redis? So many reasons: No external dependencies; Minimal failure rate; No runtime errors; In-memory is orders of magnitude faster than any network request.
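    • The pattern is easy to sketch. Below is a minimal, illustrative TTL cache in Python (not the publisher’s actual code); fetch_webhook_url_from_dynamodb and customer_id are hypothetical stand-ins for the real DynamoDB lookup:

      import time

      class TTLCache:
          """Tiny in-process cache with per-entry expiry, in the spirit of the
          3-second webhook-URL cache described above."""

          def __init__(self, ttl_seconds=3.0):
              self.ttl = ttl_seconds
              self._store = {}  # key -> (value, expires_at)

          def get(self, key):
              entry = self._store.get(key)
              if entry is None:
                  return None
              value, expires_at = entry
              if time.monotonic() > expires_at:
                  del self._store[key]   # lazily evict stale entries
                  return None
              return value

          def put(self, key, value):
              self._store[key] = (value, time.monotonic() + self.ttl)

      cache = TTLCache(ttl_seconds=3.0)

      def get_webhook_url(customer_id, fetch_webhook_url_from_dynamodb):
          url = cache.get(customer_id)
          if url is None:                                   # miss or expired
              url = fetch_webhook_url_from_dynamodb(customer_id)
              cache.put(customer_id, url)
          return url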
  • A few videos from AppSec Global DC 2019 are now available.  
  • 10 Things to Consider While Using Spot Instances: Cost savings – customers can realize as much as 90% in cost savings. In fact, IDC has recently estimated that enterprises can save up to $4.85 Million over a period of 5 years by using Spot instances; Business flexibility and growth; Cost vs. Performance; Right-sizing instances for optimal performance; Developer & DevOps productivity; Application architectures; Enterprise grade SLA; Multi-cloud; Integration; Competitive advantage. @QuinnyPig: I have a laundry list of reasons why SaaS companies are a bad fit for meaningful cost optimization. Spotinst avoids nearly all of them. I like what they’re up to. 
  • Serverless: 15% slower and 8x more expensive. CardGames.io runs on AWS using a traditional mix of S3, CloudFront, EC2, Elastic Beanstalk, ELB, EBS, etc. at a cost of $164.21 a month. What happens if you move to a serverless platform? It’s not all cookies and cream. Serverless setup was 15% slower and 8x more expensive: “Our API accepts around 10 million requests a day. That’s ~35$ every day, just for API Gateway. On top of that Lambda was costing around ~10$ a day, although that could maybe be reduced by using less memory. Together that’s around 45$ per day, or 1350$ per month, vs. ~164$ per month for Elastic Beanstalk.” This generated much useful discussion (a quick back-of-the-envelope check of these numbers follows the thread below).
    • Hugo Grzeskowiak: If you have constant load on your API, use EC2 or (if containerised) ECS. Choose instances based on the application profile, e.g. CPU, memory or throughput. There’s also instances with ephemeral volumes which are the fastest “hard drives” in case that’s the bottleneck. – If your load is high but fluctuating a lot (e.g. only having traffic from one time zone during rush hours), consider the burstable instances (t3 family). For non customer facing services (backups, batch jobs, orchestration) Lambda is often a good choice. – Lambda can be used for switching Ops costs to AWS costs.
    • Samuel Smith: Though I’ve only moved one off serverless, the general idea is to leave them all there until they find some traction, then when the cost warrants it, convert them
    • Jeremy Cummins: You can use an ALB instead of API Gateway in front of a Lambda. You can also configure an Elastic Beanstalk cluster, ECS cluster, or any other container service to serve as a proxy to serve requests from your lambdas (instead of an ALB or API Gateway). If you are serving your lambda requests through a CDN as you described you can use Lambda@Edge to modify the request responses directly, so no load balancer or proxy needed.
    • dread_username: Don’t forget to factor in the managed UNIX Administration costs too. I think this is the real argument. Fully burdened cost for a senior developer where I live is about US$150,000 a year. Given the article number of $1200 a month extra ($16200 a year), if a single developer can leverage serverless for an extra 11% revenue, it’s paid for itself, and the product potentially has more features for the market.
    • marcocom: So, the reason serverless doesn’t work for most is because they don’t truly buy-in to the heavy front-end necessary to run it. They use their old JSP-style approach and that doesn’t fit the philosophy. You have to believe in JavaScript and a server-side comprises of small stupid lambdas that only know their tiny slice of the whole picture and the data they send to the front end to be consumed and persisted by a very smart stateful single-page-application.
    • endproof: Serverless is not meant to run your api or anything with relatively consistent volume. It’s meant to serve for things that have huge spikes in traffic that would be uneconomical to have resources allocated for consistently. Of course it’s slower if they’re dynamically spinning up processes to handle spikes in traffic.
    • TheBigLewinski: Given how the code was running in the first place, directly from a server behind a load balancer, why was the API gateway used? This could have been loaded into a Lambda function and attached to the ALB, as a direct replacement for the Ec2 instances. The author then admits memory simply could have been lowered, but doesn’t provide any more detail. I’m guessing if that level of traffic is currently being handled by a “small” instance, the level of memory per request should be reduced to the bare minimum. But there were no details provided about that. There are billing details on the instances, but for the latter parts, we’ll just have to take their word that it was all properly -and optimally- implemented (And they obviously were not). This is, at best, a lesson on the consequences of haphazard deployments and mindlessly buying into hype. But instead of digging in, to more deeply understand the mechanics of building an app and how to improve, they blamed the technology with a sensationalist headline.
    • abiro: PSA: porting an existing application one-to-one to serverless almost never goes as expected. 1. Don’t use .NET, it has terrible startup time. Lambda is all about zero-cost horizontal scaling, but that doesn’t work if your runtime takes 100 ms+ to initialize. The only valid options for performance sensitive functions are JS, Python and Go. 2. Use managed services whenever possible. You should never handle a login event in Lambda, there is Cognito for that. 3. Think in events instead of REST actions. Think about which events have to hit your API, what can be directly processed by managed services or handled by you at the edge. Eg. never upload an image through a Lamdba function, instead upload it directly to S3 via a signed URL and then have S3 emit a change event to trigger downstream processing. 4. Use GraphQL to pool API requests from the front end. 5. Websockets are cheaper for high throughput APIs. 6. Make extensive use of caching. A request that can be served from cache should never hit Lambda. 7. Always factor in labor savings, especially devops. The web application needs of most startups are fairly trivial and best supported by a serverless stack. Put it another way: If your best choice was Rails or Django 10 years ago, then it’s serverless today.
    • claudiusd: did the same experiment as OP and ran into the same issues, but eventually realized that I was “doing serverless” wrong. “Serverless” is not a replacement for cloud VMs/containers. Migrating your Rails/Express/Flask/.Net/whatever stack over to Lambda/API Gateway is not going to improve performance or costs. You really have to architect your app from the ground-up for serverless by designing single-responsibility microservices that run in separate lambdas, building a heavy javascript front-end in your favorite framework (React/Ember/Amber/etc), and taking advantage of every service you can (Cognito, AppSync, S3, Cloudfront, API Gateway, etc) to eliminate the need for a web framework. I have been experimenting with this approach lately and have been having some success with it, deploying relatively complex, reliable, scalable web services that I can support as a one-man show.
    • danenania: “It’s also great for when you’re first starting out and don’t know when or where you’ll need to scale.” To me this is probably the most significant benefit, and one that many folks in this discussion strangely seem to be ignoring. If you launch a startup and it has some success, it’s likely you’ll run into scaling problems. This is a big, stressful distraction and a serious threat to your customers’ confidence when reliability and uptime suffer. Avoiding all that so you can focus on your product and your business is worth paying a premium for. Infrastructure costs aren’t going to bankrupt you as a startup, but infrastructure that keeps falling over, requires constant fiddling, slows you down, and stresses you out just when you’re starting to claw your way to early traction very well might.
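    • Reproducing the CardGames.io arithmetic as a quick sanity check (figures are taken from the post above, not from a current AWS price list):

      requests_per_day = 10_000_000
      api_gateway_per_day = 35.0     # "~35$ every day, just for API Gateway"
      lambda_per_day = 10.0          # "Lambda was costing around ~10$ a day"

      serverless_monthly = (api_gateway_per_day + lambda_per_day) * 30
      beanstalk_monthly = 164.21     # the existing EC2/Elastic Beanstalk bill

      print(f"serverless: ${serverless_monthly:.0f}/month")            # ~$1350
      print(f"ratio: {serverless_monthly / beanstalk_monthly:.1f}x")   # ~8x
      print(f"implied API Gateway rate: "
            f"${api_gateway_per_day / (requests_per_day / 1e6):.2f} per million requests")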
  • This nuanced discussion from Riot Games on the Future of League’s Engine is a situation a lot of projects find themselves in. Do they go engine-heavy and move more functionality into C++ or engine-light and move more functionality into scripting? You adopt a scripting language to make your life simpler. Put the high performing core in C++ and bind it to a much easier to use scripting language. Who wants to go through a compile cycle when you can dynamically run code and get stuff done? But eventually you find clean abstractions break down, core functionality ends up all over the place, and it becomes a nightmare to extend and maintain. Riot has consciously chosen to move away from scripting and use more C++: “The reasoning described in this article has been the direction that the Gameplay group on League has been walking for a couple years now. This has lead to some shifts on how we approach projects, for example how the Champions team encapsulated the complexity of Sylas into higher-level constructs and dramatically simplified the script implementation involved. The movement towards engine-heavy and explicitly away from engine-light will provide us with a more secure footing for the increasing complexity of League.” You may still need a scripting layer for designers and users, but put the effort into making the abstractions in your core easier to use and keep it there.
  • One person’s boring is another’s pit of complexity. The boring technology behind a one-person Internet company. This is a great approach, but is it really that boring? It’s only boring if you already know the technology. Imagine a person just starting out having to learn all this “boring” stuff. It would be daunting. It’s only boring because you already know it. The new boring is always being reborn.
  • Root Cause is a Myth: root cause can’t be determined in complex socio-technical systems…Instead of choosing blame and finger-pointing when breaches happen, DevSecOps practitioners should seek shared understanding and follow blameless retrospective procedures to look at a wider picture of how the event actually unfolded. We shouldn’t fire the engineer who didn’t apply the patches, nor the CISO who hired the engineer. Instead, we look at what organizational decisions contributed to the breach.
  • It’s always hard to change a fundamental assumption of your architecture. Twitter took a long time to double their character count to 280. Slack now supports shared channels—a shared channel is one that connects two separate organizations. Yah, it’s a pain to change, but the pain of trying to create an architecture so flexible it has no limits is much greater. Make limits. Optimize around those limits. And stick your tongue out at anyone who bitches about technical debt rather than taking pride in a working system. How Slack Built Shared Channels.
    • The backend systems used the boundaries of the workspace as a convenient way to scale the service, by spreading out load among sharded systems. Specifically, when a workspace was created, it was assigned to a specific database shard, messaging server shard, and search service shard. This design allowed Slack to scale horizontally and onboard more customers by adding more server capacity and putting new workspaces onto new servers.
    • We decided to have one copy of the shared channel data and instead route read and write requests to the single shard that hosts a given channel. We used a new database table called shared_channels as a bridge to connect workspaces in a shared channel.
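    • A minimal sketch of what that routing change looks like (hypothetical table and field names, not Slack’s actual schema): regular channels still shard by workspace, while a shared channel has a single home shard that every participating workspace is routed to.

      # Hypothetical lookup tables standing in for the real sharding metadata
      workspace_to_shard = {"acme": 3, "globex": 7}       # legacy per-workspace routing
      shared_channel_to_shard = {"C123SHARED": 3}         # the shared_channels "bridge"

      def shard_for(channel_id, workspace_id, is_shared):
          if is_shared:
              # One copy of the channel data: reads and writes from any
              # participating workspace go to the channel's home shard.
              return shared_channel_to_shard[channel_id]
          # Regular channels keep the old behavior: shard by owning workspace.
          return workspace_to_shard[workspace_id]

      print(shard_for("C123SHARED", "globex", is_shared=True))   # -> 3, not globex's shard 7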
  • This makes sense. Replace limits with smarts. Password Limits on Banks Don’t Matter:  banks aggressively lock out accounts being brute forced. They have to because there’s money at stake and once you have a financial motivator, the value of an account takeover goes up and consequently, so does the incentive to have a red hot go at it. Yes, a 5-digit PIN only gives you 100k attempts, but you’re only allowed two mistakes…Banks typically use customer registration numbers as opposed to user-chosen usernames or email addresses so there goes the value in credential stuffing lists…”Do you really think the only thing the bank does to log people on is to check the username and password?”…implement additional verification processes at key stages of managing your money.
  • HotChips 31 keynote videos are available.
  • From the Critical Watch Report: Encryption-related misconfigurations are the largest group of SMB security issues; In SMB AWS environments, encryption & S3 bucket configuration are a challenge; Weak encryption is a top SMB workload configuration concern; Most unpatched vulnerabilities in the SMB space are more than a year old; The three most popular TCP ports account for 65% of SMB port vulnerabilities; Unsupported Windows versions are rampant in mid-sized businesses; Outdated Linux kernels are present in nearly half of all SMB systems; Active unprotected FTP servers lurk in low-level SMB devices; SMB email servers are old and vulnerable.
  • Update on fsync Performance: In this post, instead of focusing on the performance of various devices, we’ll see what can be done to improve fsync performance using an Intel Optane card…The above results are pretty amazing. The fsync performance is on par with a RAID controller with a write cache, for which I got a rate of 23000/s and is much better than a regular NAND based NVMe card like the Intel PC-3700, able to deliver a fsync rate of 7300/s. Even enabling the full ext4 journal, the rate is still excellent although, as expected, cut by about half.
  • It turns out the decentralized DNS system is actually quite centralized in practice. DNS Resolver Centrality: While 90% of users have a common set of 1.8% of open resolvers and AS resolver sets configured (Figure 4), 90% of users have the entirety of their DNS queries directed to some 2.6% of grouped resolvers. In this case out of some 15M experiments on unique end points, some 592 grouped resolvers out of a total pool of 23,092 such resolver sets completely serve 90% of these 15M end points, and these users direct all their queries to resolvers in these 592 resolver sets. Is this too centralised? Or is it a number of no real concern? Centrality is not a binary condition, and there is no threshold value where a service can be categorised as centralised or distributed. It should be noted that the entire population of Internet endpoints could also be argued to be centralised in some fashion. Out of a total of an estimated 3.6 billion Internet users, 90% of these users appear to be located within 1.2% of networks, or within 780 out of a total number of 65,815 of ASNs advertised in the IPv4 BGP routing system
  • Why is Securing BGP So Hard?: BGP security is a very tough problem. The combination of the loosely coupled decentralized nature of the Internet and a hop-by-hop routing protocol that has limited hooks on which to hang credentials relating to the veracity of the routing information being circulated unite to form a space that resists most conventional forms of security. 
  • So many ways to shoot yourself in the lambda. Serverless Cost Containment: concurrency can bite you by parallelising your failures, enabling you to rack up expenses 1,000 times faster than you thought!; A common error cause I’ve seen in distributed systems is malformed or unexpected messages being passed between systems, causing retry loops; If a Lambda listening to an SQS queue can’t process the message, it returns it to the queue… and then gets given it back again and again!; A classic new-to-serverless example is related to loops: an S3 bucket event (or any other Lambda event source) triggers a function that then writes back to the same source, causing an infinite loop; Using messages and queues as a way to decouple your functions is generally a good architectural practice to use; it can also protect you from some cost surprises; Create dashboards to visually monitor for anomalies; Setting a billing alert also serves as a catch-all for other scenarios that you’d want to know about (e.g. being attacked in a way that causes you to consume resources).
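    • The S3-triggers-itself loop in particular is easy to guard against. Here is a minimal, illustrative Lambda handler in Python (bucket layout and prefixes are made up, not from the linked post): read from one prefix, write to another, and ignore events for anything else so your own output can never re-trigger the function.

      import boto3

      s3 = boto3.client("s3")
      INPUT_PREFIX = "uploads/"
      OUTPUT_PREFIX = "processed/"

      def handler(event, context):
          for record in event.get("Records", []):
              bucket = record["s3"]["bucket"]["name"]
              key = record["s3"]["object"]["key"]

              if not key.startswith(INPUT_PREFIX):
                  # Ignore anything outside the input prefix, including our
                  # own output, so one stray notification can't start a loop.
                  continue

              body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
              result = body.upper()  # stand-in for real processing

              out_key = OUTPUT_PREFIX + key[len(INPUT_PREFIX):]
              s3.put_object(Bucket=bucket, Key=out_key, Body=result)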
  • High autonomy and little hierarchy is a trait tech companies and startups share. Software Architecture is Overrated, Clear and Simple Design is Underrated: Start with the business problem; Brainstorm the approach; Whiteboard your approach; Write it up via simple documentation with simple diagrams; Talk about tradeoffs and alternatives; Circulate the design document within the team/organization and get feedback. 
    • No disagreement with this well written and well thought out article, but the idea that anyone can agree on what is simple and clean in any complex domain is wishful thinking. That’s why there are so many rewrites on projects. New people come in who were not part of the context that produced the “simple and clean” design, so they inevitably don’t understand what’s going on, and they create their own new “simple and clean” system. Complexity happens one decision at a time and if you weren’t part of those decisions chances are you won’t understand the resulting code. Programmers produce solutions; let’s stop pretending they are simple and clean.
    • gregdoesit: We had a distributed systems expert join the team, who was also a long-time architect. A junior person invited him to review his design proposal on a small subsystem. This experienced engineer kept correcting the junior engineer on how he’s not naming things correctly and mis-using terms. The design was fine and the tradeoffs were good and there was no discussion about anything needing changes there, but the junior engineer came out devastated. He stopped working on this initiative, privately admitting that he feels he’s not experienced enough to design anything this complex and first needs to read books and learn how it’s done “properly”. This person had similar impact on other teams, junior members all becoming dis-engaged from architecture discussions. After we figured out this pattern, we pulled this experienced engineer aside and had a heart to heart on using jargon as a means to prove your smart, opposed to making design accessible to everyone and using it to explain things. I see the pattern of engineers with all background commenting and asking questions on design documents that are simple to read. But ones that are limiting due to jargon that’s not explained in the scope of the document get far less input.
    • And here’s a great explanation of a common attractor in the chaotic dynamical system that is a company. Why are large companies so difficult to rescue (regarding bad internal technology): There are two big problems that plague rescue efforts at big companies: history and trust…All of which helps explain why technology rescues at bigger, older companies are so difficult. One is constantly fighting against history…To a large extent “be agile” is almost synonymous with “trust each other.” If you’re wondering why large companies have trouble being agile, it is partly because it is impossible for 11,000 people to trust each other the way 5 people can. That is simply reality. Until someone can figure out the magic spell that allows vast groups of people, in different countries, with different cultures, speaking different languages, to all trust each other as if they were good friends, then people in the startup community need to be a lot more careful about how carelessly they recommend that larger companies should be more agile.
  • Everything You Need To Know About API Rate Limiting. An excellent overview of different methods—request queues, throttling, rate-limiting algorithms—one to add would be use an API Gateway and let them worry about it.
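    • For a flavor of the algorithms involved, one common choice is a token bucket. A minimal single-process sketch in Python (illustrative only; a production API would need a shared store such as Redis to enforce limits across servers):

      import time

      class TokenBucket:
          """Allow roughly `rate` requests per second with bursts up to `capacity`."""

          def __init__(self, rate, capacity):
              self.rate = rate
              self.capacity = capacity
              self.tokens = capacity
              self.last = time.monotonic()

          def allow(self):
              now = time.monotonic()
              # Refill tokens for the time elapsed since the last check.
              self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
              self.last = now
              if self.tokens >= 1:
                  self.tokens -= 1
                  return True
              return False

      bucket = TokenBucket(rate=5, capacity=10)   # ~5 req/s, bursts of 10
      print([bucket.allow() for _ in range(12)])  # first ~10 pass, the rest are throttled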
  • The 64 Milliseconds Manifesto: In an interactive software application, any user action SHOULD result in a noticeable change within 16ms, actionable information within 32ms, and at least one full screen of content within 64ms. dahart: Waiting for the response before updating the screen is the wrong answer. It’s not impossible, you’re assuming incorrectly that the manifesto is saying the final result needs to be on screen. It didn’t say that, it said the app needs to respond to action visually, not that the response or interaction sequence must be completed within that time frame. The right answer for web and networked applications is to update the screen with UI that acknowledges the user action and shows the user that the result is pending. Ideally, progress is visible, but that’s tangential to the point of the manifesto. A client can, in fact, almost always respond to actions within these time constraints. The point is to do something, rather than wait for the network response.
  • StackOverflow on why they love love love .NET Core 3.0. The presentation is glitzy, not your normal tech blog post. Can’t wait to see the series on Netflix. Stack Overflow OLD. It’s faster; apps can run on Windows, Macs, Linux, and run in Azure cloud; cloud deploys are easier because there are fewer moving pieces; SO is being broken up into modules that can be run in different areas which allows experimenting with k8s and Docker; most interestingly since they can run in a container they can ship appliances to customers which lowers support costs and makes it easier to onboard customers; they can move code to become middleware; it’s easier to test because they can test end-to-end; since .NET core is on GitHub they can fix errors; they can just build software instead of dealing with the meta of building software. 

Soft Stuff:

  • RSocket (video): Developed by Netifi in collaboration with Netflix, Facebook, Pivotal, Alibaba and others, RSocket combines messaging, stream processing and observability in a single, lightweight solution that provides the connectivity needed for today’s web, mobile and IoT applications. Unlike older technologies such as REST or gRPC, RSocket is equally adept at handling service calls as well as high-throughput streaming data and is at home in the datacenter as well as in the cloud, browsers and mobile/IoT devices.
  • fhkingma/bitswap (article, video): We introduce Bit-Swap, a scalable and effective lossless data compression technique based on deep learning. It extends previous work on practical compression with latent variable models, based on bits-back coding and asymmetric numeral systems. In our experiments Bit-Swap is able to beat benchmark compressors on a highly diverse collection of images. 
  • dgraph-io/ristretto (article):  a fast, concurrent cache library using a TinyLFU admission policy and Sampled LFU eviction policy.

Pub Stuff:

  • Weld: A Common Runtime for High Performance Data Analytics (article): Weld uses a common intermediate representation to capture the structure of diverse data-parallel workloads, including SQL, machine learning and graph analytics. It then performs key data movement optimizations and generates efficient parallel code for the whole workflow.
  • Low-Memory Neural Network Training: A Technical Report: Using appropriate combinations of these techniques, we show that it is possible to reduce the memory required to train a WideResNet-28-2 on CIFAR-10 by up to 60.7x with a 0.4% loss in accuracy, and reduce the memory required to train a DynamicConv model on IWSLT’14 German to English translation by up to 8.7x with a BLEU score drop of 0.15.
  • Quantum Supremacy Using a Programmable Superconducting Processor: The tantalizing promise of quantum computers is that certain computational tasks might be executed exponentially faster on a quantum processor than on a classical processor. A fundamental challenge is to build a high-fidelity processor capable of running quantum algorithms in an exponentially large computational space. Here, we report using a processor with programmable superconducting qubits to create quantum states on 53 qubits, occupying a state space of 2^53 ≈ 10^16. Measurements from repeated experiments sample the corresponding probability distribution, which we verify using classical simulations. While our processor takes about 200 seconds to sample one instance of the quantum circuit 1 million times, a state-of-the-art supercomputer would require approximately 10,000 years to perform the equivalent task. This dramatic speedup relative to all known classical algorithms provides an experimental realization of quantum supremacy on a computational task and heralds the advent of a much-anticipated computing paradigm.

from High Scalability

Stuff The Internet Says On Scalability For September 20th, 2019


Wake up! It’s HighScalability time:

Do you like this sort of Stuff? I’d love your support on Patreon. I wrote Explain the Cloud Like I’m 10 for people who need to understand the cloud. And who doesn’t these days? On Amazon it has 54 mostly 5 star reviews (125 on Goodreads). They’ll learn a lot and likely add you to their will.

Number Stuff: 

  • 30 Mbits/second: telemetry sent by all F1 cars at all times during a race.
  • ~9 inches: Cerebras Systems’ new chip that completely rethinks the form factor for datacenter computing.
  • 85%: browsers support WebAssembly.
  • $75bn: market value created by enterprise-focused companies this year. 
  • 20TB: Western Digital hard drive designed primarily for write once read many (WORM) applications.
  • 80%: music industry revenue ($4.3 billion) provided by streaming. 
  • $10 million: Facebook funding for deep fake detection.
  • 1.2 trillion: transistors on a single die (46,225 mm2) for a machine learning accelerator chip, the first chip in history to pass a trillion transistors.
  • 58%: increase in spending on China’s cloud infrastructure, to $2.295bn.
  • 46 days: for generative adversarial networks and reinforcement learning to examine research and patents and create a potential drug.
  • $15,000: cost of a drone used to knock out half of Saudi Arabia’s oil supply.
  • 30%: certs for web domains are made by Let’s Encrypt. 
  • 70%: users wrongly identify what a safe URL should look like.
  • 3-4x: faster, on what is effectively a memory bound problem, by interleaving 32 binary searches together.
  • $18,000: cost of 1.5TB of RAM.
  • 130 meters: material deposited in a day by the asteroid impact that killed the dinos.
  • 80,000: seeds hidden by Mountain Chickadees to survive winter. 
  • 2nd: place Netflix now holds in the great who uses the most internet bandwidth race. HTTP media streaming is number one. 
  • 3,100: entities to whom the Wisconsin DMV sold personal data.
  • 1/3rd: podcast advertising business size compared to those ads you see before a movie. 
  • 400km/second: speed particles travel in space. 

Quotable Stuff:

  • Vint Cerf: Four decades ago, when Bob Kahn and I were creating the TCP/IP networking protocol for the internet, we did not know that we were laying the tracks for what would become the digital superhighway that powers everything in society from modern business to interpersonal relationships. By that same token, we also didn’t envision that people would intentionally take advantage of the network to commit theft and fraud.
  • apenwarr: Absolute scale corrupts absolutely. The pattern is: the cheaper interactions become, the more intensely a system is corrupted. The faster interactions become, the faster the corruption spreads.
  • @cdespinos: For comparison, *two* iPhone 11 Pro phone processors contain more transistors than we shipped in all the 6502s in all the Apple IIs we ever made.
  • @aallan: “People no longer think about their destination being 10km away or 10 stops on the tube. They think about it being 50% of their battery away…” people have always navigated by pubs in Britain, or gas stations in the States. They divide space into concepts.
  • digitalcommerce: With 58% of its $232.9 billion revenue coming from its marketplace, Amazon is the largest online U.S. marketplace. Last year alone, over 25,000 sellers worldwide sold more than $1 million on Amazon, with the average U.S. marketplace merchant selling more than $90,000 on the site.
  • Brenon Daly: Paper may be pricy these days, but it still has value as an M&A currency. So far this year, US public companies have been using their own shares at a near-record rate to pay for the tech deals they are doing.
  • @QuinnyPig: I’m staring at a client bill of ~$120K a month of cross-AZ transfer. Not sold. Not sold at all.
  • mscs: Many people I’ve met who are now saying “I want to put my app in AWS” were saying “I want to put my app in Hadoop” a few years ago with an equal understanding of what it means: zero. The excitement over the shininess of the technology is completely overwhelming the practicality of using it. A key part of my job is to show people how the sausage is made, and give them the slightly more polite version of this talk.
  • Zach_the_Lizard: Not Google, but another Big Tech company. Visibility is very important to getting a promotion at a large company. Selling your work is important. To move up, you must be playing the “choose a good project or team” game for at least 6 months before you try to get promoted. Preferably for a year or more to hit the right checkboxes for multiple cycles. If you fail to do so, you can do absolutely amazing work but rigid processes and evaluation criteria will conspire to defeat you in a promotion committee setting
  • Corey: ConvertKit may want to take a look at what regions they’re in. If they can get by with us-east-1 and us-east-2 as their two regions, cross-region data transfer is half-price between them, as even data wants to get the hell out of Ohio. Capitalize on that.
  • cmcaine: Loop unrolling doesn’t always give you this benefit. This technique is about rewriting an algorithm to be branchless and to separate memory accesses that are dependent on each other with independent work (here by interleaving iterations). Doing both allows the processor to request lots of memory locations in parallel. Unrolling reduces the number of branches but doesn’t necessarily interleave independent memory accesses.
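    • cmcaine’s comment above pairs with the “interleaving 32 binary searches” number. A structural sketch in Python of what the rewrite looks like (the speedup only materializes in a compiled language, where the branch-free step lets the CPU keep many independent cache-line fetches in flight; Python just shows the shape):

      def interleaved_binary_search(sorted_arr, keys):
          """Run one branch-free binary search per key, advancing all of them in lockstep."""
          bases = [0] * len(keys)            # one in-progress search per key
          length = len(sorted_arr)
          while length > 1:
              half = length // 2
              for i, key in enumerate(keys):
                  # Branch-free step; in C this is: base += (arr[base + half] <= key) * half
                  bases[i] += half * (sorted_arr[bases[i] + half] <= key)
              length -= half
          return [bases[i] if sorted_arr[bases[i]] == keys[i] else -1
                  for i in range(len(keys))]

      data = list(range(0, 1000, 3))                                 # 0, 3, 6, ..., 999
      print(interleaved_binary_search(data, [0, 3, 300, 301, 999]))  # -> [0, 1, 100, -1, 333]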
  • @davidgerard: “The difference between BTC and ETH is that BTC has a vaporware second layer scaling solution, while ETH has a vaporware first layer scaling solution.” – SirBender
  • Rasmus Lerdorf: I had absolutely no idea… At every point along the way, I figured there were about six months of life left in PHP. Because that’s about the amount of time I thought it would take for somebody to write something real that could replace it, that also would work for me. And I kept waiting… And nothing did.
  • @sarahmei: 1,000,000% this. If a senior engineer can’t explain it to you in a way you understand – even if you are a lot less experienced – that’s THEIR deficiency.  Not yours. When this happens, it means THEY actually don’t understand it well enough.
  • @mjpt777: You are never sure a design is good until it has been tested by many. Be prepared and eager to learn then adapt. You can never know everything in advance.
  • @davidgerard: The key dumbassery in Facebook’s Libra is:* they spent two years working out their blockchain plan and philosophy; * they spent 0 of this time talking to the financial regulators who could yea or nay their financial product.
  • @channingwalton: “Your team is one of the best performing in the organisation, I will come and see what you’re doing.” “I’ve observed and I’m not happy. You aren’t using the official scrum story format or using Jira. You must do those things properly.”
  • @adrianco: The Netflix zone aware microservices architecture did’t cross zones – 8 years ago. Not sure why people think its a good idea. After LB picks a zone stay there.
  • @taotetek: I’ve complained about config.sys, dot files, ini files, sendmail macros, xml, json, yml, etc throughout my career – name a config format and I’ve complained about it – and what the complaints really were was “configuring this thing is tedious and hard”.
  • @iamdevloper: employee: I want growth in my role company: *installs ping pong table* employee: autonomy? company: *creates fully-stocked snack room* employee: fulfilment. company: *employs a live DJ in the office* employee: *quits* company: some people just aren’t a good culture fit.
  • Taylor Clauson: Recently, when I opened the AWS Console, I had a moment of deja vu that called back to a 2010 post by Andrew Parker, then of Spark Capital, that focused on the startup opportunities of unbundling Craigslist. And it shifted a long-simmering feeling into focus for me: I am really excited about startups that are unbundling AWS. There are obvious differences between Craigslist and AWS. The most important is that Craigslist (and each of the category spawn) is a marketplace, and so has the powerful advantage of network effects. Another distinction is that AWS has relative cost advantages over its unbundlers when it comes to the fundamental components of infrastructure (compute, bandwidth, storage, etc.), and I can’t see a parallel to Craigslist. So it’s not a perfect analogy, but the premise of unbundling certain categories still holds.
  • Mikhail Aleksandrovich Bakunin: In antiquity, slaves were, in all honesty, called slaves. In the middle ages, they took the name of serfs. Nowadays they are called users.
  • dannypgh: I spent 9 years at Google, just left at the end of July. The biggest thing I grew to appreciate was that for iterating large scale production systems, rollout plans are as important as anything else. A very large change may be cost or risk prohibitive to release at once, but with thought you can subdivide the release into easier to rollout and verify subcomponents, you can usually perform the same work as a series of lower risk well understood rollouts. That’s critical enough where it’s worth planning for this from the design phases: much as how you may write software differently to allow for good unit tests, you may want to develop systems differently to allow for good release strategies.
  • jorawebdev: Other than having “Google” on my resume there is nothing special or applicable outside of Google. Most tools are internal, isolated and the choices are restrictive. Management is shitty – micro-management is in full bloom, display lack of management knowledge, skills and there’s plenty of abuse of power. They don’t show their appreciation to what we do. All developers are very competitive. My entire time of over a year in 2 different teams is spent in isolation and self learning without much help or directions. I’m currently actively interviewing outside.
  • mharroun: We spend ~900$ a month on fargate to run our of our dev, stage, qa, and prod environments as well as some other services and sqs consumers. After the recent price decrease we looked at how much reserve instance would save us and the few hundred in savings would not make sense vs the over provisioning and need to dedicate resources to scaling and new tools to monitor individual containers.
  • Laura Nolan: There could be large-scale accidents because these things will start to behave in unexpected ways. Which is why any advanced weapons systems should be subject to meaningful human control, otherwise they have to be banned because they are far too unpredictable and dangerous.
  • bt848: The most important thing I learned there was that you hire and promote people not so management can tell them what to do, but so they can tell management what to do. I didn’t even realize it at the time but the first post-Google job I had there was some clueless manager trying to set the roadmap and I was like “LOL what is this guy’s problem?” That when I started noticing the reasons that Google succeeds.
  • redact207: It’s sad to see this approach being pushed so hard these days. I believe AWS recommends these things so your app becomes so deeply entrenched in all of their services that you’ve locked yourself into them forever. Unfortunately I see a lot of mid and senior devs try to propose these architectures during interviews and when asked questions on integration testing, local environment testing, enforcing contracts, breaking changes, logical refactoring a lot of it falls apart. There’s a lot of pressure for those guys to deliver these sort of highly complex systems and all the DevOps that surround it when they enter a new company or project. Rarely is the project scope or team size considered and huge amounts of time are wasted implementing the bones since they skip the whole monolith stage. Precious few companies are at the size where they can keep an entire Dev team working on a shopping cart component in perpetuity. For most it’s just something that gets worked on for a sprint or two. AWS made a fundamental assumption in this that monoliths and big balls of mud are the same thing. Monoliths can and should be architected with internal logical separation of concerns and loose coupling. Domain driven design helps achieve that. The other assumption that microservices is a fix to this isn’t, because the same poor design can be written but now with a network in between everything.
  • usticezyx: Here a summary of my understanding: * Relentlessly hire the best, train them and give space for them to grow and shine. This is the basis. It’s not that a less quality engineer cannot grow, it’s just too costly to do that at large scale. * Build the infrastructure to support large scale engineering. The best example is the idea of “warehouse scale computer” that is the data center, embodied in systems like Borg Spanner etc. * Relentless consistency at global scale. Example would be the Google C++ style guide. It’s opinionated at its time, but is crucial to ensure a large C++ code base to grow to big size. * Engineering oriented front line management. L6-L8 managers are mostly engineering focused. That’s necessary. * Give the super star the super star treatment. Google’s engineers are rewarded as strategic asset of the company. Numerous example are made public.
  • BAHAR GHOLIPOUR: This would not imply, as Libet had thought, that people’s brains “decide” to move their fingers before they know it. Hardly. Rather, it would mean that the noisy activity in people’s brains sometimes happens to tip the scale if there’s nothing else to base a choice on, saving us from endless indecision when faced with an arbitrary task. The Bereitschaftspotential would be the rising part of the brain fluctuations that tend to coincide with the decisions. This is a highly specific situation, not a general case for all, or even many, choices.
  • Hope Reese: Cavanagh argues that, like bees, humans swarm in sync and change course en masse. She points to the legalization of marijuana, or support for gay marriage, as examples of those tipping points. Public support was “slowly building, but then seemed to, all of a sudden, flip,” she tells OneZero.
  • Mike Isaac: I will say, and I try to get at this point in the book, that a lot of forces came together at the same time to make it possible for something like Uber to exist. And so one could argue that someone else would have done it if not Travis, and to be sure, there were a bunch of competitors at the time. But he did it the biggest out of the many that tried. So you have to give him credit for that.
  • Spotify: At Spotify, one of our engineering strategies is the creation and promotion of the use of “Golden Paths.” Golden Paths are a blessed way to build products at Spotify. They consist of a set of APIs, application frameworks, best practices, and runtime environments that allow Spotify engineers to develop and deploy code safely, securely, and at scale. We complement these with opt-in programs that help increase quality. From our bug bounty program reports, we’ve found that the more that development adheres to a Golden Path, the less likely there is to be a vulnerability reported to us.
  • Steve Cheney: Much breath is wasted over 5G, which is largely a mirage… The real innovation in wireless is happening at the localization and perception levels of the stack. Ultra Wideband gives a 100x increase in fidelity for localization. By using standard handshaking at the physical layer of the 802.15.4z standard, novel properties have emerged for not only localization of devices, but also permission, access control and commerce. In fact, UWB will replace Bluetooth (10x the throughput) and subsume NFC—both transferring of high bandwidth data (phone to glasses) and short range payments will standardize around UWB. 
  • DSHR: The conclusion is that Google’s monopoly is bad for society, but none of the proposed anti-trust remedies address the main reason that it is bad, which is that it is funded by advertising. Even if by anti-trust magic we ended up with three search engines each with 30% of the market, if all three were funded by advertising we’d still be looking at the same advertiser-friendly results. What we need is an anti-trust ruling that says, for example, no search engine with more than 10% market share of US Web search may run any kind of “for pay” content in its result pages, because to do so is a conflict of interest.
  • @rbranson: The thing that nobody talks about with the whole sidecar pattern is how much CPU it burns. If you’re moving a lot of data in/out through the proxies it can be non-trivial. Adding 10-15% to your compute budget is a serious ask.
  • Julian Barbour: People have the idea that the universe started in a special, ordered state, and it’s been getting disordered ever since then. We’re suggesting it’s completely the other way around. The universe, in our view, starts in the most disordered way possible and, at least up to now, it’s been getting ever more interesting. 
  • Tony Albrecht: The dashboards are built with Tableau. The data is fed in from various other sources. We [Riot Games] do track lots of other performance metrics; memory, peak memory, load times, frame spikes, HDD/CPU/GPU specs, and many others. This article is the result of distilling out the data which has no (or very little) impact on performance. We’re regularly adding to that and refining what we have. The filtering by passmark already removes the diurnal effect.
  • tetha: And that’s IMO where the orchestration solutions and containers come in. Ops should provide build chains, the orchestration system and the internal consulting to (responsibly) hand off 80% of that work to developers who know their applications. Orchestration systems like K8, Nomad or Mesos make this much, much easier than classical configuration management solutions. They come with a host of other issues, especially if you have to self-host like we do, no question. Persistence is a bitch, and security the devil. Sure. But I have an entire engineering year already available to setup the right 20% for my 20 applications, and that will easily scale to another 40 – 100 applications as well with some management and care. That’s why we as the ops-team are actually pushing container orchestrations and possibly self-hosted FaaS at my current place.
  • shantly: – If it doesn’t have very different scaling needs from the rest of the application, probably don’t make it a microservice. – If it isn’t something you could plausibly imagine using as a 3rd party service (emailer/sms/push messages, authentication/authorization, payments, image processing, et c.) probably don’t make it a microservice. – If you don’t have at least two applications at least in development that need the same service/functionality, probably don’t make a microservice. – If the rest of your app will completely fall over if this service fails, probably don’t make it a microservice. – Do write anything that resembles the above, but that you don’t actually need to make a microservice yet, as a library with totally decoupled deps from the rest of your program.
  • Gojko Adzic: there is an emerging pattern with Lambda functions to move from a traditional three-tier deployment architecture to a more thick-client approach, even with browsers. In usual three-tier applications, the middle layer deals the business logic, but also with security, workflow orchestration and sessions. Lambda functions are not really suited to session management and long-running orchestration. Business logic stays in Lambda functions, but the other concerns need to go somewhere else. With the AWS platform providing security, a relatively good choice for session management and workflows is to push them to the client completely. So client code is getting thicker and smarter. It’s too early to talk about best practices, because the platform is still rapidly evolving, but I definitely think that this is an interesting trend that needs closer examination. Moving workflow orchestration and sessions to client code from the middle-tier and servers means that application operators do not need to pay for idle work, and wait on background tasks. With a traditional three-tier application, for example, letting a client device write directly to a database, or access background file storage directly is a recipe for disaster and a security nightmare. But because the AWS platform decoupled security from processing, we find ourselves more and more letting client devices talk to traditional back-end resources directly, and set up Lambda functions to just act on events triggered by back-end resources.
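    • As a concrete illustration of the pattern Gojko describes, a back end might only hand out a short-lived signed URL and let the client PUT straight to S3, with downstream work triggered by the bucket’s change event rather than by a request passing through a middle tier. A minimal Python sketch (bucket and key names are made up):

      import uuid
      import boto3

      s3 = boto3.client("s3")
      BUCKET = "example-upload-bucket"   # hypothetical bucket name

      def issue_upload_url(user_id):
          key = f"uploads/{user_id}/{uuid.uuid4()}.jpg"
          url = s3.generate_presigned_url(
              "put_object",
              Params={"Bucket": BUCKET, "Key": key, "ContentType": "image/jpeg"},
              ExpiresIn=300,   # URL valid for five minutes
          )
          return {"upload_url": url, "key": key}

      # The client then HTTP PUTs the image bytes to upload_url; an S3
      # ObjectCreated notification (to Lambda, SQS, etc.) picks it up from there.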

Useful Stuff:

  • #326: A chat with Adrian Cockcroft. What happens after chaos engineering becomes the norm? 
    • Chaos engineering won’t be seen as extreme, it will become how you run resilient systems. It will be continuous, productized, and it will be expected for anyone running a highly reliable and highly available system. 
    • Large global companies like Netflix, Uber, Lyft running in a cloud native environment don’t have any scheduled down time. It’s expected they’ll always be available. These architectures are active-active to keep everything up and running all the time. Airlines and finance companies were usually built around a traditional active-passive disaster recovery model. They are now looking to shut down all their datacenters, so they’re figuring out how to fail over into the cloud when something fails.
    • The US financial industry has regulations and audit requirements stemming from two sources: 9/11, where backup datacenters were on a different floor in the same building, and the 2008 financial crisis. These two events caused a whole series of regulations to be created. Some of the larger banks and financial institutions are regulated as strategically important financial institutions (SIFIs). They’re regulated as a group because they are interdependent, and if they stopped, bad things would happen to the US and/or world economy. There’s coordinated testing at least once a year where the companies get together and practice failovers between banks. It’s not just one bank testing one application. It’s like a game day at an industry level. As these industries move into the cloud, AWS has to figure out how to support them.
    • It’s more about consistency than availability. When someone says something happened it really must have happened. When you say $150 million transferred between banks both banks must agree it happened. It’s ok to say we didn’t do it, try again, as long as they agree.
    • In a large scale active-active system you don’t want any down time, nobody can see it stopping. In that case a little inconsistency is OK. Showing the wrong movies on Netflix doesn’t matter that much.
    • In the financial industry, where consistency is key, you need a budget of time to determine if an operation is consistent. Typically that’s 30 seconds. If something is completely broken that triggers disaster recovery failover to another site. It should take no more than 2 hours to get everything completely back up again. The entire set of applications must be shut down and restarted properly. You can take 2 hours of down time to fail over.
    • Two ideas are recovery point objective and recovery time objective. Here’s an example of an old style backup/restore type recovery. If you take tape backups every 24 hours and ship them offsite then your recover point is every 24 hours.  If you have a disaster and you need to pull the tapes to the recovery site and get everything running that might take 3 days of recovery time. 
    • With online services there’s typically a 30 second recovery point so you can recover to 30 seconds ago and it should take 2 hours to recover at another site. 
    • You must be able to recover from the loss of a site. The US has three independent power grids: the eastern, western, and Texas interconnections. There have been cases where a power grid has gone down across a whole region. The two locations where you recover a workload can have no employees in common. It shouldn’t be possible to commute from one location to the other. Preferably sites should be in different parts of the country. In the US you can run resilient loads between the east and west coasts. AWS is trying to work out equivalent regions in the rest of the world.
    • AWS Outposts can be used as an internally allocated region for failing over. With Outposts it’s more complex because you have to figure out the control and data planes.
    • You have 30 seconds to make sure data is in both sites. Traditionally this is done at the storage tier with block-level replication across regions. It’s all very custom, flaky, expensive, and hard to test, and it often doesn’t work well. You have to have these sites for disaster recovery, but they aren’t giving you more availability because failing over is so fragile.
    • The next level up is the database (DynamoDB and Aurora). AWS is hardening these services for financial applications. 30 seconds sounds like a long time, but it’s actually hard to get everything done in that time. You also need to protect against long-tail latency so operations complete within the 30 seconds. This is what they are really focusing on as the way to ship data to other locations.
    • The next level up is the application level. Split traffic into two streams and send it to both sites to be processed in parallel. You have a complete copy that’s fast to switch over to. But it’s expensive and difficult to keep in sync.
    • It’s very hard to keep datacenters in sync. The cloud is easier because it’s much more programmatic. It’s still too hard, but in the cloud you can make sure both sites are in sync. AWS will productize these disaster recovery patterns as they can.
    • The goal is to be able to test disaster recovery in real time by setting up multiple regions and continually injecting the failure modes a system should be able to survive: showing it can run in 2 out of 3 zones, and testing region failover so you know exactly what it looks like, how it happens, and how to train everybody so that once you hit the emergency you know exactly what to do. The best analogy is fire escapes. In case of fire, use the fire escape. Buildings have fire drills. It’s a universal training process. We don’t have the equivalent in the cloud and datacenter world for handling emergencies.
    • Tracing, auditing, and logging feed into disaster recovery because you can have a tamper-proof log of everything that happened: all the config, the exact sequence of API calls that created a site. You know the exact versions of everything. And when you fail over you can recreate the site using that information and verify everything is exactly the same before failing over. You can prove a system has undergone regular tests by looking at the logs.
  • Videos from always awesome Strange Loop conference are now available.
  • Videos from Chaos Communication Camp 2019 are now available. You might like Fully Open, Fully Sovereign mobile devices which talks about “MEGAphone, which is not only a mobile phone, but also includes UHF packet radio and a modular expansion scheme, that can allow the incorporation of satellite and other communications.”
  • Data ghost stories are the scariest. And it’s not even Halloween! Google Has My Dead Grandpa’s Data And He Never Used The Internet
  • If you’re a Claude Shannon fan you’ll probably like this information theory approach to aging. #70 – David Sinclair, Ph.D.: How cellular reprogramming could slow our aging clock. Humans have about 200 different cell types. As embryos all our cells are the same type. Each cell has a full copy of DNA, which contains the genes used to produce proteins. Cells differentiate because certain genes are turned on and off. You can consider that pattern of gene expression as a program. Over time our cells lose the program of which genes should be turned on and off. Cells lose their identity. The idea is to reset the system to reinstall the cellular software that turns the right genes off and on. You can reprogram cells to regain their youth and identity. Technology used to generate stem cells is used to partially reprogram cells to turn on the youthful pattern of genes that we once had. As we get older our cells do not turn on the genes they did when we were younger. Genes that were on when we were young get switched off. Reprogramming resets that pattern. Genes that were once tightly bundled up by SIR proteins and methylated DNA come unwound as we get older. Reprogramming tells the cell to package that region of the genome up again and switch that gene on or off again, depending on the cell type. The cell keeps its original configuration around. We haven’t lost the genes due to mutation. The information is still there; we just don’t access it because the cells don’t know to spool up the DNA and hide it or to expose it to turn on the right genes.
  • Excellent deck on Taking Serverless to the Next Level
  • For communication to be effective it must have structure. Emails are often a waste of time because they lack intent. Here’s how to fix that. BLUF: The Military Standard That Can Make Your Writing More Powerful. And that’s also why Slack can suck. A thread doesn’t replace the hard work of crafting a clear and precise message. 
  • Great question. Are we safer on-prem or in the cloud? Cloud Risk Surface Report: Cloud consolidation is a thing; the top 5 clouds alone host assets from 75% of organizations. Overall, organizations are over twice as likely to have high or critical exposures in high-value assets hosted in the cloud vs. on-prem. BUT clouds with the lowest exposure rates do twice as well as on-prem. Smokey the Bear was right—only you can prevent cloud fires. Even though we discovered an average 12X difference between clouds with highest and lowest exposure rates, this says more about users than providers. Security in the cloud isn’t on the cloud; it’s on you.
  • Cuckoo Hashing: A new element is always inserted in the first hash table. Should a collision occur, the existing element is kicked out and inserted in the second hash table. Should that in turn cause a collision, the second existing element will be kicked out and inserted in the first hash table, and so on. This continues until an empty bucket is found (see the sketch after this list).
  • So many ops these days. GitOps. DevOps. NoOps. AiOps. CloudOps. ITOps. SecOps. DevSecOps. It’s all just sys admin to me. 
  • Want to learn about distributed systems? Murat has provided a really long reading list
  • Turns out caching energy has a lot in common with caching data. Ramez Naam on Renewable Energy and an Optimistic Future. Predicts we’ll have a 50/50 split between batteries at a centralized location (solar farm/wind plant) and at the edge. One reason is that at the edge (home, office building, mall), if there’s a power outage, the batteries keep the power running. Another is that your power lines are probably at capacity during peak usage time (4pm – 8pm) but empty at midnight, so batteries can be filled up at midnight and drained down during the day.
  • Keynote speeches from Scratch Conference (Raspberry Pi) Europe 2019 are now available
  • A career in data science you may not have considered is Data Fabricator. Apparently companies in the generic drug industry pay people to fake data to fool regulators. It’s hard to imagine they are the only ones. I bet it pays well. 
  • For the Azure curious, Moving from Lambda ƛ to Azure Functions provides a clear look at what it’s like to build functions on Azure. Different than Lambda for sure, but not so different. Using HTTP triggers instead of API Gateway is a plus. Also, An Introduction to Azure Functions. Also also, My learnings from running the Azure Functions Updates Twitterbot for half a year
  • 15 things I’ve learned doing serverless for a year and a half: Use the Serverless Framework; You can build almost anything; You can cache and use connection pooling; You need to be careful using connection pooling; Lambdas can be very performant if optimized; Converge on one language (JavaScript); Consider services as a collection of lambdas (and other AWS services); Integration test your whole serverless architecture; It is ~1000x cheaper; Iterative deployments make for nice development; Event-driven architecture has some nice advantages; Some things don’t play nice with CloudFormation (DNS and KMS); Use Parameter Store/Secrets Manager; Use private packages; Standardize your responses.
  • It’s painful to be an early adopter. You’re constantly in technical debt as you must continually change to the new greatest way of doing things. Neosperience Cloud journey from monolith to serverless microservices with Amazon Web Services: Neosperience Cloud evolved over the years from a monolithic architecture to a heterogeneous set of smaller modern applications. Today, our platform counts 17 different business domains, with 5 to 10 microservices each, glued together by a dozen support services. Neosperience Cloud is multi-tenant, deployed across several AWS accounts, to be able to reserve and partition AWS for each organization (a Neosperience customer). Every deployment includes more than 200 functions and uses more than 400 AWS resources through CloudFormation. Each business domain creates its resources at deploy time, thus managing their lifecycle through releases. This evolution improved our fitness function under every aspect: from scalability and lifecycle to time to market, which shifted from months down to weeks (even days for critical hotfixes). Infrastructure costs shrank by orders of magnitude. Developers have full control and responsibility for delivery, and innovation is encouraged because failure impacts only a small portion of the codebase.
  • A case study about compression and binary formats for a REST service. Surprise, JSON+LZMA2 is just a touch slower than Protobuf+LZMA2 (though Gzip worked better in production). So maybe you can just “keep it simple stupid.” 
  • When you are managed by an algorithm, as will eventually happen, what are your complaints likely to be? What People Hate About Being Managed by Algorithms, According to a Study of Uber Drivers. You won’t like being constantly surveilled. Clearly for an algorithm to function it must have a 360 degree view of your every thought and action. That might get old. You may chafe at the lack of transparency. While the algorithm is always looking at you, there’s no way for you to look back at the algorithm. It makes decisions, you obey. It’s rule by fiat. Something we purport to hate when a government does it, but accept with glee when a company-created algorithm does it. You might feel isolated and alone as you more and more interact via a device instead of through other biomorphs. The future is so bright you might want to buy algorithm-mediated shades.
  • In the same way we spend all our time trying to optimize garbage-collected languages because we think managing memory is too difficult, we spend all our time adding back type checking because we think using a typed language is just too hard. Our (Dropbox) journey to type checking 4 million lines of Python.
  • Some ApacheCon 2019 videos are now available.
  • Google’s global scale engineering: Google treats global-scale engineering as one of its core business values, if not the single most critical one…Google is almost only interested in global-scale products. Google has been willing to invest heavily in some of the world’s most challenging technical problems…Google’s global-scale engineering capacity is reflected in several key areas:
    • People management: global-scale engineering demands a global-scale engineering team. Google has more than 40k world-class software engineers, and an equal number of non-technical people
    • Technology: Technology is the foundation, they provide tools for people to collaborate, optimize operations, create new business opportunities, and enable many other innovations. A global engineering organization cannot rely on third party providers. 
    • Operations: How to make the technical infrastructure be utilized fully? How to correctly address short-term and long-term engineering goals and risks? Google pioneered SRE. 
    • Business development: Combining these together, the capability needs to reflect in products that bring actual business value. 
  • Excellent architecture evolution story. Use Serverless AWS step functions to reduce VPC costs: We separated one huge Lambda invocation which operated at the maximum call length (15 mins) into parallel processing. Lambdas are billed per 100 ms, so naturally a split of one into five separate Lambdas costs potentially almost an additional 400 ms per invocation. However, each workload can now be downsized to exactly the right resource utilisation in terms of memory and time. Every smaller run is also a tad more reliable in terms of duration (smaller variation) and your memory is quite consistent between runs, which makes for easier tuning. Our biggest payoff was that we could lose the NAT gateway, which alone pays for 500 million Lambda requests of processing (100 ms, 512 MB).
  • Why we chose Flink: To understand why we chose Flink and the features that turned Flink into an absolute breakthrough for us, let’s first discuss our legacy systems and operations. Firstly, before migrating to Flink, the team was heavily reliant on Amazon SQS as a queue processing mechanism. Amazon SQS does not support state, so the team had to work off a fragile, in-house-built state that could not support our use case. With the use of Flink’s stateful computations we could easily retain in Flink state the past (arrival) ping, necessary to successfully run the clustering algorithm — something hard to accomplish with the previous technology. Secondly, Flink provides low latency and high throughput computation, something that is a must-have for our application. Prior to the migration, the team was operating on an abstraction layer on top of SQS, executed with Python. Due to Python’s global interpreter lock, concurrency and parallelism are very limited. We tried to increase the number of processing instances, but having so many instances meant it was very hard to find resources with our container management system PaaSTA, which made our deployments take hours to complete. Flink’s low latency, high throughput architecture is a great fit for the time-sensitive and real-time elements of our product. Finally, Amazon SQS provides at-least-once guarantees, which are incompatible with our use case since having duplicate pushes in the output stream could lead to duplicate push notifications to the user, resulting in a negative user experience. Flink’s exactly-once semantics guarantee that our use case leads to a superior user experience as each notification will only be sent once to the user.
  • UPMEM Processor-in-Memory at HotChips Conference: What is PIM (Processing in Memory) all about?  It’s an approach to improving processing speed by taking advantage of the extraordinary amount of bandwidth available within any memory chip. The concept behind PIM is to build processors right into the DRAM chip and tie them directly to all of those internal bit lines to harness the phenomenal internal bandwidth that a memory chip has to offer.  This is not a new idea.  I first heard of this concept in the 1980s when an inventor approached my then-employer, IDT, with the hopes that we would put a processor into one of our 4Kbit SRAMs!  Even in those days a PIM architecture would have dramatically accelerated graphics processing, which was this inventor’s goal.
  • Mind blowing stuff. Detailed and scary. Unless we rewrite everything in Rust we’ll always be at risk from very clever people working very hard to do very bad things. A very deep dive into iOS Exploit chains found in the wild: Working with TAG, we discovered exploits for a total of fourteen vulnerabilities across the five exploit chains: seven for the iPhone’s web browser, five for the kernel and two separate sandbox escapes. 
  • What a Prehistoric Monument Reveals about the Value of Maintenance. The White Horse of Uffington is a 3000-year-old figure of a football field sized horse etched into a hillside. Without regular maintenance the figure would have faded long ago. People regularly show up to rechalk and maintain the horse. To maintain software systems over time we’ve essentially corporatized ritual, but the motivation is the same.
  • Great list. Choose wisely. 7 mistakes when using Apache Cassandra: When does Cassandra work best? In append-only scenarios, like time-series data or Event Sourcing architecture (e.g. based on Akka Persistence). Be careful, it’s not a general-purpose database.
  • Paper review. Gray Failure: The Achilles’ Heel of Cloud-Scale Systems. Not sure I agree with this one. Observation of distributed system state always suffers from an inherent relativity problem. There’s no such thing as now or 100% working. They also suffer from an emergent complexity problem. There are always paths through the system that can never be anticipated in advance. Kind of like a Godel number. The biological model is you can have skin cancer and the rest of the body keeps right on working.  
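Referring back to the Cuckoo Hashing item above: here is a minimal, illustrative two-table cuckoo hash map in Python. The table size, the hash functions (the built-in hash salted per table), and the displacement limit are arbitrary choices for the sketch, not from any particular implementation.

```python
class CuckooHash:
    """Minimal two-table cuckoo hash map (illustrative sketch)."""

    def __init__(self, size=8, max_kicks=32):
        self.size = size
        self.max_kicks = max_kicks            # give up and rehash after this many displacements
        self.tables = [[None] * size, [None] * size]

    def _slot(self, which, key):
        # Two "independent" hash functions, one per table, by salting the built-in hash.
        return hash((which, key)) % self.size

    def get(self, key):
        # A key can live in at most one of two slots, so lookup is at most two probes.
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        raise KeyError(key)

    def put(self, key, value):
        # Update in place if the key is already present in either table.
        for which in (0, 1):
            slot = self._slot(which, key)
            e = self.tables[which][slot]
            if e is not None and e[0] == key:
                self.tables[which][slot] = (key, value)
                return
        # Otherwise insert into the first table, kicking occupants back and forth.
        entry, which = (key, value), 0
        for _ in range(self.max_kicks):
            slot = self._slot(which, entry[0])
            entry, self.tables[which][slot] = self.tables[which][slot], entry
            if entry is None:
                return                        # landed in an empty bucket
            which = 1 - which                 # kick the evicted element into the other table
        self._rehash(entry)                   # probable cycle: grow the tables and reinsert

    def _rehash(self, pending):
        old = [e for t in self.tables for e in t if e is not None] + [pending]
        self.size *= 2
        self.tables = [[None] * self.size, [None] * self.size]
        for k, v in old:
            self.put(k, v)
```

The appeal of the scheme is that lookups are worst-case constant time (at most two probes); the cost of the kick-out chain is paid at insert time.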

Soft Stuff:

  • appwrite/appwrite (article): a simple to use backend for frontend and mobile apps. Appwrite provides client side (and server) developers with a set of REST APIs to speed up their app development times.
  • botslayer: an application that helps track and detect potential manipulation of information spreading on Twitter. The tool is developed by the Observatory on Social Media at Indiana University — the same lab that brought to you Botometer and Hoaxy.
  • Adapton/adapton.rust: A general-purpose Incremental Computation (IC) library for Rust.
  • wepay/waltz (article): quorum-based distributed write-ahead log for replicating transactions.

Pub Stuff:

  • Amazon Web Services’ Approach to Operational Resilience in the Financial Sector & Beyond: The purpose of this paper is to describe how AWS and our customers in the financial services industry achieve operational resilience using AWS services. The primary audience of this paper is organizations with an interest in how AWS and our financial services customers can operate services in the face of constant change, ranging from minor weather events to cyber issues.
  • Data Transfer Project Overview and Fundamentals: The Data Transfer Project (DTP) extends data portability beyond a user’s ability to download a copy of their data from their service provider (“provider”), to providing the user the ability to initiate a direct transfer of their data into and out of any participating provider. The Data Transfer Project is an open source initiative to encourage participation of as many providers as possible.
  • Procella: unifying serving and analytical data at YouTube: The big hairy audacious goal of Procella was to “implement a superset of capabilities required to address all of the four use cases… with high scale and performance, in a single product” (aka HTAP1). That’s hard for many reasons, including the differing trade-offs between throughput and latency that need to be made across the use cases.

from High Scalability

Managing High Availability in PostgreSQL – Part III: Patroni

In our previous blog posts, we discussed the capabilities and functioning of PostgreSQL Automatic Failover (PAF) by Cluster Labs and Replication Manager (repmgr) by 2ndQuadrant. In the final post of this series, we will review the last solution, Patroni by Zalando, and compare all three at the end so you can determine which high availability framework is best for your PostgreSQL hosting deployment.

Patroni for PostgreSQL

Patroni originated as a fork of Governor, a project from Compose. It is an open-source tool suite, written in Python, for managing high availability of PostgreSQL clusters. Instead of building its own consistency protocol, Patroni smartly leverages the consistency model provided by a Distributed Configuration Store (DCS). It supports multiple DCS solutions, such as ZooKeeper, etcd, Consul, and Kubernetes.

Patroni ensures the end-to-end setup of PostgreSQL HA clusters, including streaming replication. It supports various ways for creating a standby node, and works like a template that can be customized to your needs.

This feature-rich tool exposes its functionality via REST APIs and also via a command line utility called patronictl. It supports integration with HAProxy by using its health check APIs to handle load balancing.
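As a concrete illustration of the health-check integration, a load balancer or a quick script can ask each node’s Patroni REST API for its current role. The sketch below assumes Patroni’s default API port (8008) and its role endpoints, which return HTTP 200 only when the node currently holds that role; the node names are placeholders and the `requests` library is assumed to be installed. Verify endpoint names against your Patroni version.

```python
import requests

# Placeholder node names; Patroni's REST API listens on port 8008 by default.
NODES = ["pg-node1", "pg-node2", "pg-node3"]

def role_of(node, timeout=2):
    """Return 'master', 'replica', 'unreachable', or 'unknown' based on Patroni's role endpoints.

    Each endpoint answers HTTP 200 only when the node currently holds that role,
    which is exactly what HAProxy-style health checks rely on.
    """
    for role in ("master", "replica"):
        try:
            resp = requests.get(f"http://{node}:8008/{role}", timeout=timeout)
        except requests.RequestException:
            return "unreachable"
        if resp.status_code == 200:
            return role
    return "unknown"

if __name__ == "__main__":
    for node in NODES:
        print(node, role_of(node))
```

HAProxy uses the same idea: point an HTTP health check at the role endpoint so traffic is only routed to the node that answers 200 for the role you want.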

Patroni also supports event notification with the help of callbacks, which are scripts triggered by certain actions. It enables users to perform any maintenance actions by providing pause/resume functionality. The Watchdog support feature makes the framework even more robust.

How it Works

Initially, the PostgreSQL and Patroni binaries need to be installed. Once this is done, you will also need to set up an HA DCS configuration. All the configuration necessary to bootstrap the cluster needs to be specified in the YAML configuration file, and Patroni will use this file for initialization. On the first node, Patroni initializes the database, obtains the leader lock from the DCS, and ensures the node runs as the master.

The next step is adding standby nodes, for which Patroni provides multiple options. By default, Patroni uses pg_basebackup to create the standby node, and also supports custom methods like WAL-E, pgBackRest, Barman, and others for standby node creation. Patroni makes it very simple to add a standby node, and handles all the bootstrapping tasks and the setup of your streaming replication.

Once your cluster setup is complete, Patroni will actively monitor the cluster and ensure it’s in a healthy state. The master node renews the leader lock every ttl seconds (default: 30 seconds). When the master node fails to renew the leader lock, Patroni triggers an election, and the node that obtains the leader lock is elected as the new master.

How Does it Handle the Split Brain Scenario?

In a distributed system, consensus plays an important role in determining consistency, and Patroni uses DCS to attain consensus. Only the node that holds the leader lock can be the master and the leader lock is obtained via DCS. If the master node doesn’t hold the leader lock, then it will be demoted immediately by Patroni to run as a standby. This way, at any point in time, there can only be one master running in the system.
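To make the leader-lock mechanism concrete, here is a deliberately simplified sketch of the pattern Patroni builds on: each node tries an atomic create/compare-and-swap of a leader key with a TTL in the DCS, the winner keeps refreshing it, and a node that cannot acquire or refresh the key runs (or demotes itself to run) as a replica. This is illustrative code against a stubbed in-memory "DCS", not Patroni's actual implementation; the node names, TTL, and loop structure are invented for the example.

```python
import time

class FakeDCS:
    """Stand-in for etcd/Consul/ZooKeeper: one leader key that can be set with a TTL."""
    def __init__(self):
        self.owner, self.expires = None, 0.0

    def acquire_or_refresh(self, node, ttl):
        # A real DCS does this as an atomic create/compare-and-swap; here it's a plain check.
        now = time.time()
        if self.owner == node or self.owner is None or now > self.expires:
            self.owner, self.expires = node, now + ttl
            return True
        return False

def heartbeat(dcs, node, ttl=30):
    """One iteration of the loop each agent runs every loop_wait seconds."""
    if dcs.acquire_or_refresh(node, ttl):
        return f"{node}: holds the leader lock -> run as master"
    return f"{node}: lock held elsewhere -> run as (or demote to) replica"

if __name__ == "__main__":
    dcs = FakeDCS()
    for tick in range(3):                     # simulate a few rounds for two nodes
        for node in ("pg-node1", "pg-node2"):
            print(tick, heartbeat(dcs, node))
```

Because only one node can hold the key at a time, at most one node runs as master; a former master that can no longer refresh the key (for example, when it is network-isolated from the DCS) stops seeing itself as the leader and demotes itself.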

Are There Any Setup Requirements?

  • Patroni requires Python 2.7 or above.
  • The DCS and its specific Python module must be installed. For test purposes, the DCS can be installed on the same nodes that run PostgreSQL. However, in production, the DCS must be installed on separate nodes.
  • The YAML configuration file must be present and use these high-level configuration sections (a minimal sketch follows this list):

    Global/Universal
    This includes configuration such as the name of the host (name) which needs to be unique for the cluster, the name of the cluster (scope) and path for storing config in DCS (namespace).

    Log
    Patroni-specific log settings including level, format, file_num, file_size etc.

    Bootstrap configuration
    This is the global configuration for a cluster that will be written to the DCS. These configuration parameters can be changed with the help of the Patroni APIs or directly in the DCS. The bootstrap configuration includes standby creation methods, initdb parameters, post-initialization scripts, etc. It also contains timeout settings and parameters that decide the usage of PostgreSQL features like replication slots, synchronous mode, etc. This section will be written into /<namespace>/<scope>/config of the given configuration store after initializing a new cluster.

    PostgreSQL
    This section contains the PostgreSQL-specific parameters like authentication, directory paths for data, binaries and config, listen IP address, etc.

    REST API
    This section includes the Patroni-specific configuration related to the REST API, such as listen address, authentication, SSL, etc.

    Consul
    Settings specific to Consul DCS.

    Etcd
    Settings specific to Etcd DCS.

    Exhibitor
    Settings specific to Exhibitor DCS.

    Kubernetes
    Settings specific to Kubernetes DCS.

    ZooKeeper
    Settings specific to ZooKeeper DCS.

    Watchdog
    Settings specific to Watchdog.
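Pulling the sections above together, here is a minimal, illustrative bootstrap configuration. The host names, paths, and credentials are placeholders, only a handful of the available settings are shown, and the exact keys should be checked against the Patroni documentation for your version. The sketch builds the structure as a Python dict and dumps it to YAML (assumes PyYAML is installed).

```python
import yaml  # PyYAML

# Illustrative only: placeholder hosts, paths, and credentials.
patroni_config = {
    "scope": "demo-cluster",          # cluster name
    "namespace": "/service/",         # path prefix for config stored in the DCS
    "name": "pg-node1",               # must be unique per host
    "restapi": {
        "listen": "0.0.0.0:8008",
        "connect_address": "pg-node1:8008",
    },
    "etcd": {"hosts": "etcd1:2379,etcd2:2379,etcd3:2379"},
    "bootstrap": {
        "dcs": {                      # written to the DCS when the cluster is first initialized
            "ttl": 30,                # leader lock TTL (seconds)
            "loop_wait": 10,
            "retry_timeout": 10,
            "postgresql": {"use_pg_rewind": True},
        },
        "initdb": [{"encoding": "UTF8"}, "data-checksums"],
    },
    "postgresql": {
        "listen": "0.0.0.0:5432",
        "connect_address": "pg-node1:5432",
        "data_dir": "/var/lib/postgresql/data",
        "authentication": {
            "replication": {"username": "replicator", "password": "change-me"},
            "superuser": {"username": "postgres", "password": "change-me"},
        },
    },
}

with open("patroni.yml", "w") as f:
    yaml.safe_dump(patroni_config, f, default_flow_style=False, sort_keys=False)
```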

Patroni Pros

  • Patroni enables end-to-end setup of the cluster.
  • Supports REST APIs and HAProxy integration.
  • Supports event notifications via callback scripts triggered by certain actions.
  • Leverages DCS for consensus.

Patroni Cons

  • Patroni will not detect the misconfiguration of a standby with an unknown or non-existent node in recovery configuration. The node will be shown as a slave even if the standby is running without connecting to the master/cascading standby node.
  • The user needs to handle setup, management, and upgrade of the DCS software.
  • Requires multiple ports to be open for component communication:
    • REST API port for Patroni
    • Minimum 2 ports for DCS

High Availability Test Scenarios

We conducted a few tests on PostgreSQL HA management using Patroni. All of these tests were performed while the application was running and inserting data into the PostgreSQL database. The application was written using the PostgreSQL Java JDBC Driver, leveraging its connection failover capability.
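The original test harness used the JDBC driver’s multi-host failover support; the same idea can be sketched in Python via libpq’s multi-host connection string, which psycopg2 passes straight through (PostgreSQL 10+). The host names, credentials, and the ha_test table are placeholders invented for the example.

```python
import time
import psycopg2

# libpq tries the listed hosts in order and, with target_session_attrs=read-write,
# only accepts the one that is currently the writable primary.
DSN = ("host=pg-node1,pg-node2,pg-node3 port=5432 dbname=testdb "
       "user=app password=change-me target_session_attrs=read-write "
       "connect_timeout=3")

def write_forever():
    """Keep inserting rows; reconnect through the multi-host DSN after a failover."""
    while True:
        conn = None
        try:
            conn = psycopg2.connect(DSN)
            conn.autocommit = True
            cur = conn.cursor()
            while True:
                # Assumes a table created beforehand: CREATE TABLE ha_test (ts timestamptz);
                cur.execute("INSERT INTO ha_test (ts) VALUES (now())")
                time.sleep(1)
        except psycopg2.OperationalError as exc:
            print(f"lost the primary, reconnecting: {exc}")
            time.sleep(1)
        finally:
            if conn is not None:
                conn.close()

if __name__ == "__main__":
    write_forever()
```

With a setup like this, "no disruption" in the tables below means inserts keep succeeding; "downtime" means the writer sees connection errors until it reconnects to the new primary.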

Standby Server Tests

1. Kill the PostgreSQL process
  • Patroni brought the PostgreSQL process back to running state.
  • There was no disruption of the writer application.

2. Stop the PostgreSQL process
  • Patroni brought the PostgreSQL process back to running state.
  • There was no disruption of the writer application.

3. Reboot the server
  • Patroni needs to be started after reboot, unless configured to not start on reboot. Once Patroni was started, it started the PostgreSQL process and set up the standby configuration.
  • There was no disruption of the writer application.

4. Stop the Patroni process
  • It did not stop the PostgreSQL process.
  • patronictl list did not display this server.
  • There was no disruption of the writer application.

So, essentially, you need to monitor the health of the Patroni process – otherwise it will lead to issues down the line.

Master/Primary Server Tests

1. Kill the PostgreSQL process
  • Patroni brought the PostgreSQL process back to running state. Patroni running on that node held the primary lock, so an election was not triggered.
  • There was downtime in the writer application.

2. Stop the PostgreSQL process and bring it back immediately after health check expiry
  • Patroni brought the PostgreSQL process back to running state. Patroni running on that node held the primary lock, so an election was not triggered.
  • There was downtime in the writer application.

3. Reboot the server
  • Failover happened and one of the standby servers was elected as the new master after obtaining the lock. When Patroni was started on the old master, it brought the old master back up, performed pg_rewind, and started following the new master.
  • There was downtime in the writer application.

4. Stop/Kill the Patroni process
  • One of the standby servers acquired the DCS lock and became the master by promoting itself.
  • The old master was still running, which led to a multi-master scenario. The application was still writing to the old master.
  • Once Patroni was started on the old master, it rewound the old master (use_pg_rewind was set to true) to the new master's timeline and LSN and started following the new master.

As you can see above, it is very important to monitor the health of the Patroni process on the master. Failure to do so can lead to a multi-master scenario and potential data loss.

Network Isolation Tests

1. Network-isolate the master server from other servers
  • DCS communication was blocked for the master node.
  • PostgreSQL was demoted on the master server.
  • A new master was elected in the majority partition.
  • There was downtime in the writer application.

2. Network-isolate the standby server from other servers
  • DCS communication was blocked for the standby node.
  • The PostgreSQL service was running; however, the node was not considered for elections.
  • There was no disruption in the writer application.

What’s the Best PostgreSQL HA Framework?

Patroni is a valuable tool for PostgreSQL database administrators (DBAs), as it performs end-to-end setup and monitoring of a PostgreSQL cluster. The flexibility of choosing DCS and standby creation is an advantage to the end user, as they can choose the method they are comfortable with.

REST APIs, HAProxy integration, Watchdog support, callbacks, and its feature-rich management make Patroni the best solution for PostgreSQL HA management.

PostgreSQL HA Framework Testing: PAF vs. repmgr vs. Patroni

Included below is a comprehensive comparison detailing the results of all the tests we have performed on all three frameworks – PostgreSQL Automatic Failover (PAF), Replication Manager (repmgr), and Patroni.

Standby Server Tests

Kill the PostgreSQL process
  • PAF: Pacemaker brought the PostgreSQL process back to running state. There was no disruption of the writer application.
  • repmgr: The standby server was marked as failed. Manual intervention was required to start the PostgreSQL process again. There was no disruption of the writer application.
  • Patroni: Patroni brought the PostgreSQL process back to running state. There was no disruption of the writer application.

Stop the PostgreSQL process
  • PAF: Pacemaker brought the PostgreSQL process back to running state. There was no disruption of the writer application.
  • repmgr: The standby server was marked as failed. Manual intervention was required to start the PostgreSQL process again. There was no disruption of the writer application.
  • Patroni: Patroni brought the PostgreSQL process back to running state. There was no disruption of the writer application.

Reboot the server
  • PAF: The standby server was marked offline initially. Once the server came up after reboot, PostgreSQL was started by Pacemaker and the server was marked as online. If fencing had been enabled, the node would not have been added back to the cluster automatically. There was no disruption of the writer application.
  • repmgr: The standby server was marked as failed. Once the server came up after reboot, PostgreSQL was started manually and the server was marked as running. There was no disruption of the writer application.
  • Patroni: Patroni needs to be started after reboot, unless configured to not start on reboot. Once Patroni was started, it started the PostgreSQL process and set up the standby configuration. There was no disruption of the writer application.

Stop the framework agent process
  • PAF (agent: pacemaker): The PostgreSQL process was stopped and was marked offline. There was no disruption of the writer application.
  • repmgr (agent: repmgrd): The standby server will not be part of an automated failover. The PostgreSQL service was found to be running. There was no disruption of the writer application.
  • Patroni (agent: patroni): It did not stop the PostgreSQL process. patronictl list did not display this server. There was no disruption of the writer application.

Master/Primary Server Tests

Kill the PostgreSQL process
  • PAF: Pacemaker brought the PostgreSQL process back to running state. The primary recovered within the threshold time, so an election was not triggered. There was downtime in the writer application.
  • repmgr: repmgrd started health checks for the primary server connection on all standby servers for a fixed interval. When all retries failed, an election was triggered on all the standby servers. As a result of the election, the standby with the latest received LSN was promoted. The standby servers that lost the election wait for the notification from the new master node and follow it once they receive the notification. Manual intervention was required to start the PostgreSQL process again. There was downtime in the writer application.
  • Patroni: Patroni brought the PostgreSQL process back to running state. Patroni running on that node held the primary lock, so an election was not triggered. There was downtime in the writer application.

Stop the PostgreSQL process and bring it back immediately after health check expiry
  • PAF: Pacemaker brought the PostgreSQL process back to running state. The primary recovered within the threshold time, so an election was not triggered. There was downtime in the writer application.
  • repmgr: repmgrd started health checks for the primary server connection on all standby servers for a fixed interval. When all the retries failed, an election was triggered on all the standby nodes. However, the newly elected master didn't notify the existing standby servers since the old master was back. The cluster was left in an indeterminate state and manual intervention was required. There was downtime in the writer application.
  • Patroni: Patroni brought the PostgreSQL process back to running state. Patroni running on that node held the primary lock, so an election was not triggered. There was downtime in the writer application.

Reboot the server
  • PAF: An election was triggered by Pacemaker after the threshold time during which the master was not available. The most eligible standby server was promoted as the new master. Once the old master came up after reboot, it was added back to the cluster as a standby. If fencing had been enabled, the node would not have been added back to the cluster automatically. There was downtime in the writer application.
  • repmgr: repmgrd started an election when the master connection health check failed on all standby servers. The eligible standby was promoted. When this server came back, it didn't join the cluster and was marked failed. The repmgr node rejoin command was run to add the server back to the cluster. There was downtime in the writer application.
  • Patroni: Failover happened and one of the standby servers was elected as the new master after obtaining the lock. When Patroni was started on the old master, it brought the old master back up, performed pg_rewind, and started following the new master. There was downtime in the writer application.

Stop the framework agent process
  • PAF (agent: pacemaker): The PostgreSQL process was stopped and it was marked offline. An election was triggered and a new master was elected. There was downtime in the writer application.
  • repmgr (agent: repmgrd): The primary server will not be part of an automated failover. The PostgreSQL service was found to be running. There was no disruption in the writer application.
  • Patroni (agent: patroni): One of the standby servers acquired the DCS lock and became the master by promoting itself. The old master was still running, which led to a multi-master scenario. The application was still writing to the old master. Once Patroni was started on the old master, it rewound the old master (use_pg_rewind was set to true) to the new master's timeline and LSN and started following the new master.

Network Isolation Tests

Network-isolate the master server from other servers (split brain scenario)
  • PAF: Corosync traffic was blocked on the master server. The PostgreSQL service was turned off and the master server was marked offline due to the quorum policy. A new master was elected in the majority partition. There was downtime in the writer application.
  • repmgr (all servers have the same value for location in the repmgr configuration): repmgrd started an election when the master connection health check failed on all standby servers. The eligible standby was promoted, but the PostgreSQL process was still running on the old master node. There were two nodes running as master, and manual intervention was required after the network isolation was corrected.
  • repmgr (the standby servers share one value for location but the primary has a different one): repmgrd started an election when the master connection health check failed on all standby servers. But no new master was elected, since the standby servers had a location different from that of the primary. repmgrd went into degraded monitoring mode. PostgreSQL was running on all the nodes and there was only one master in the cluster.
  • Patroni: DCS communication was blocked for the master node. PostgreSQL was demoted on the master server. A new master was elected in the majority partition. There was downtime in the writer application.

Network-isolate the standby server from other servers
  • PAF: Corosync traffic was blocked on the standby server. The server was marked offline and the PostgreSQL service was turned off due to the quorum policy. There was no disruption in the writer application.
  • repmgr: repmgrd went into degraded monitoring mode. The PostgreSQL process was still running on the standby node. Manual intervention was required after the network isolation was corrected.
  • Patroni: DCS communication was blocked for the standby node. The PostgreSQL service was running; however, the node was not considered for elections. There was no disruption in the writer application.

from High Scalability

Stuff The Internet Says On Scalability For September 6th, 2019

Wake up! It’s HighScalability time:

Coolest or most coolest thing ever?

Do you like this sort of Stuff? I’d love your support on Patreon. I wrote Explain the Cloud Like I’m 10 for people who need to understand the cloud. And who doesn’t these days? On Amazon it has 54 mostly 5 star reviews (125 on Goodreads). They’ll learn a lot and likely add you to their will.

Number Stuff:

  • lots: programmers who can’t actually program. 
  • 2x: faster scheduling of jobs across a datacenter using reinforcement learning, a trial-and-error machine-learning technique, to tailor scheduling decisions to specific workloads in specific server clusters. 
  • 300 msecs: time it takes a proposed Whole Foods biometric payment system to scan your hand and process your transaction.
  • $8 million: Slack revenue loss from 2 hours of downtime. (catchpoint email)
  • 8.4 million+: websites participating in Google’s user tracking/data gathering network. It broadcasts personal data about visitors to these sites to 2,000+ companies, hundreds of billions of times a day
  • 20x: BlazingSQL faster than Apache Spark on Google Cloud Platform using NVIDIA’s T4 GPUs by loading data directly into GPU memory using GPU DataFrame (GDF).
  • 405: agencies with access to Ring data. 
  • middle: age at which entrepreneurs are most successful. Youth is not a key trait of successful entrepreneurs. 
  • 5: years until we have carbon nanotube chips in our computers.
  • 5 billion: DVDs shipped by Netflix over 21 years.
  • 51%: chance the world as we know it will not end by 2050.
  • 1,100: US business email compromise scams per month at a cost of $300 million. 

Quotable Stuff:

  • @kennwhite: Merkle trees aren’t gonna fix a low-bid state contractor unpatched 2012 IIS web server
  • Werner Vogels: To succeed in using application development to increase agility and innovation speed, organizations must adopt five elements, in any order: microservices; purpose-built databases; automated software release pipelines; a serverless operational model; and automated, continuous security. The common thing we have seen, though, is that customers who build modern applications see benefits across their entire businesses, especially in how they allocate time and resources. They spend more time on the logic that defines their business, scale up systems to meet peak customer demand easily, increase agility, and deliver new features to market faster and more often.
  • @Carnage4Life: This post points out that rents consume $1 out of every $8 of VC investment in the Bay Area. 
  • @kentonwilliston: Too little, too late. RISC-V has already cornered the “open” core market IMO, and if I wanted a second option it’s hard to see why I’d go with Power over others like MIPS Open
  • echopom: > Why Does Developing on Kubernetes Suck? IMHO because we are in a phase of transition. Having worked for years in the software industry, I’m convinced we are halfway to a much bigger transformation for Software Engineers / SREs / Developers etc… I work in a Neobank (N26, Revolut, etc…); we are currently in the process of re-writing our entire Core Banking System with MicroServices on top of Kubernetes with Kafka. Not a single day passes without engineers needing to have an exchange about defining basically all of the terms that exist within the K8s/Docker/Kafka world. – What’s a Pod? How does a pod behave if Kafka goes down? Do we really need ZooKeeper etc….Their workflows are insanely complex and require hours if not a day to deploy a single change… obviously let’s not even talk about the amount of work our SRE has in the pipe to “package” the entire stack of 150+ services in K8s through a single YAML file….
  • millerm: I have had this thought for many years. Where is all the perfectly designed, bug free, maintenance-bliss, fully documented, fully tested, future-proofed code located so we can all marvel at its glory?
  • @dvassallo: I agree with the advice. Still, I like these PaaS experiments. There’s a big opportunity for “conceptual compression” on AWS, and I bet one day we’ll see a good PaaS/framework that would be a good choice for the average Twitter for Pets app. And I doubt that would come from AWS.
  • JPL: Atomic clocks combine a quartz crystal oscillator with an ensemble of atoms to achieve greater stability. NASA’s Deep Space Atomic Clock will be off by less than a nanosecond after four days and less than a microsecond (one millionth of a second) after 10 years. This is equivalent to being off by only one second every 10 million years.
  • Nathan Schneider: Pursuing decentralization at the expense of all else is probably futile, and of questionable usefulness as well. The measure of a technology should be its capacity to engender more accountable forms of trust.
  • @tef_ebooks: docker is just static linking for millenials
  • @Hacksterio: “But it’s not until we look at @TensorFlow Lite on the @Raspberry_Pi 4 that we see the real surprise. Here we see a 3X-4X increase in inferencing speed between our original TensorFlow benchmark, and the new results using TensorFlow Lite…”
  • @cmeik: I used to think, and have for many years, that partial failure was the fundamental thing and that’s what needed to be surfaced. I’m not sure I believe that anymore; I’m starting to think it’s more about uncertainty instead. But, I don’t know.
  • @benedictevans: Fun with maths: The Moto MC68000 CPU in the original Mac had 68k transistors. Apple sold 372k units in 1984. 68k x 372k=25.3bn The A12X SoC in an iPad Pro has 10bn transistors. So, if you’re inclined to really unfair comparisons: 3 iPads => all Macs sold in the first year
  • atombender: This meme needs to die. Kubernetes is not overkill for non-Google workloads. In my current work, we run several Kubernetes clusters via GKE on Google Cloud Platform. We’re a tiny company — less than 20 nodes running web apps, microservices and search engines — but we’re benefiting hugely from the operational simplicity of Kubernetes. Much, much, much better than the old fleet of Puppet-managed VMs we used to run. Having surveyed the competition (Docker Swarm, Mesos/Marathon, Rancher, Nomad, LXD, etc.), I’m also confident that Kubernetes was the right choice. Kubernetes may be a large and complex project, but the problem it solves is also complex. Its higher-level cluster primitives are vastly better adapted to modern operations than the “simple” Unix model of daemons and SSH and what not. The attraction isn’t just the encapsulation that comes with containers, but the platform that virtualizes physical nodes and allows containers to be treated as ephemeral workloads, along with supporting primitives like persistent volumes, services, ingresses and secrets, and declarative rules like horizontal autoscalers and disruption budgets. Given this platform, you have a “serverless”-like magically scaling machine full of tools at your fingertips. You don’t need a huge workload to benefit from that.
  • cryptica: I’m starting to think that many of the most successful tech companies of the past decade are not real monopolies but succeeded purely because the centralization of capital made it difficult for alternative projects to compete for a limited period of time. Even projects with strong network effects are unlikely to last forever.
  • Code Lime: [Hitting the same database from several microservices] almost refutes the whole philosophy of microservice architecture. They should be independent and self-contained. They should own their data and have complete freedom on how it is persisted. They are abstractions that help de-couple processes. Obviously, they come with a fair amount of overhead for this flexibility. Yet, flexibility is what you should aim for.
  • gervase: When I was running hiring at a previous startup, we ran into this issue often. When I proposed adding FizzBuzz to our screening process, I got a fair amount of pushback from the team that it was a waste of the candidates’ time. Once we’d actually started using it, though, we found it filtered between 20-30% of our applicant pool, even when we let them use literally any language they desired, presumably their strongest.
  • @jensenharris: There’s no such thing as a “startup inside of a big company.” This misnomer actively misleads both big company employees working in such teams as well as people toiling in actual startups. Despite all best efforts to create megacorp “startups”, they can never exist. Here’s why: 1) The most fundamental, pervasive background thread of an early-stage startup is that when it fails, everyone has to find a new job. The company is gone, kaput, relegated to the dustbin of Crunchbase. The company literally lives & dies on the work every employee does every day.
  • @math_rachel: “A company can easily lose sight of its strategy and instead focus strictly on the metrics that are meant to represent it… Wells Fargo never actually had a cross-selling strategy. It had a cross-selling metric.”
  • @ben11kehoe: Don’t put your processed data in the same bucket as the raw ingested data—different lifecycle and backup requirements #sdmel19
  • Lauren Feiner: The proposed solutions focus on removing weaker players from the ecosystem and undermining the hate clusters from within. Johnson and his team suggest that, rather than attacking a highly vocal and powerful player, social media platforms remove smaller clusters and randomly remove individual members. Removing just 10% of members from a hate cluster would cause it to begin to collapse, the researchers say.
  • @mathiasverraes: Philosophy aside, the important questions are, does exposing persisted events have the same practical downsides (in the long term) as exposing state? If so, are there better mitigations? Are the downsides outweighing the upsides? I’m leaning to no, yes, no.
  • @jessitron: “building software isn’t at all like assembling a car. In terms of managing growth, it’s more like raising a child or tending a garden.” @KevinSimler
  • Kevin Simler: In a healthy piece of code, entropic decay is staved off by dozens of tiny interventions — bug fixes, test fixes, small refactors, migrating off a deprecated API
  • streetcat1: First, I must say that the inventors of UML saw it as the last layer. The grand vision was complete code generation from UML diagrams. And this was the overall grand vision that drove OO in general. I think that this is what is happening now with the “low code” startups. The whole idea is to separate the global decisions (which are hard to change), e.g. architecture, what classes, what each class does, from the local ones (e.g. which data structure to use). So you would use UML for the global decisions, and then make programming the classes almost mechanical.
  • Yegor Bugayenko: Our empirical evidence suggests even expert programmers really learn to program within a given domain. When expert programmers switch domains, they do no better than a novice. Expertise in programming is domain-specific. We can teach students to represent problems in a form the computer could solve in a single domain, but to teach them how to solve in multiple domains is a big-time investment. Our evidence suggests students graduating with a four-year undergraduate degree don’t have that ability. Solving problems with a computer requires skills and knowledge different from solving them without a computer. That’s computational thinking. We will never make the computer completely disappear. The interface between humans and computers will always have a mismatch, and the human will likely have to adapt to the computer to cover that mismatch. But the gap is getting smaller all the time. In the end, maybe there’s not really that much to teach under this definition of computational thinking. Maybe we can just design away the need for computational thinking.
  • @aphyr: If you do this, it makes life so much easier. Strict serializability and linearizability become equivalent properties over histories of txns on maps. If you insist on making the individual r/w micro-ops linearizable, it *breaks* serializability, as we previously discussed.
  • Jennifer Riggins: Datadog itself conducts regular game days where it kills a certain service or dependency to learn what threatens resiliency. These game days are partnerships between the people building whatever’s being tested — as they know best and are initially on-call if it breaks — and a site reliability engineer. This allows the team to test monitoring and alerting, making sure that dashboards are in place and there are runbooks and docs to follow, making sure that the site reliability engineer is equipped to eventually take over.
  • dragonsh: Indeed Uber did try to enter Indonesia and they failed; they were out of most of South East and East Asia because they didn’t have enough engineering talent to build a system for those specific countries. Local companies like Grab in Singapore, Gojek in Indonesia, and Didi in China beat them. So why would you think those companies do not have the talent to build systems better suited to their own environment than Uber?
  • blackoil: We handle peaks of 800k tps in a few systems. It is for an analytical platform. Partition in Kafka by some evenly distributed key, create simple apps that read from a partition and process it, and commit the offset. Avoid communication between processes/threads. Repartition using Kafka only. For some cases we had to implement sampling where the use case required highly skewed partitions.
  • chihuahua: I was working at Amazon when the 2-pizza team idea was introduced. A week or two later, we thought “we’re a 2-pizza team now, let’s order some pizza”. That’s when we found out that there was no budget for pizza; it was merely a theoretical concept. At the time the annual “morale budget” (for food and other items) was about $5 per person. These days I think the morale budget is a bit higher; in 2013 there were birthday cakes once a month.
  • kator: Another thing that often gets overlooked is the concept of “Single Threaded Owner”. I’m an STO on a topic, that means I write and communicate the known truth and our strategy and plans, I participate in discussions around that topic, I talk to customers about it, I read industry news and leverage my own experience in that topic. Others know me as that STO and reach out to me with related topics if something makes sense to me in my topic area then I try to address it, if not I connect the person with another STO I think would be interested in their idea or problem. Success at Amazon is deeply driven by networking, we have an internal tool called Phonetool which allows you to quickly navigate the company and find people who are close to the topic you have in mind. I keep thinking it’s like the six degrees of separation concept, if somebody doesn’t know the topic they know someone who is closer to the topic, within a couple of emails you are in a conversation with someone on the other side of the company who is passionate, fired up and knows more about the topic than you thought could be known. They’re excited to talk to you about their topic and teach you or learn from your new idea related to their area of focus.
  • Const-me: You know what is a waste of my time? When I wrote a good answer to a question which I think is fine, which then goes to oblivion because some other people, who often have absolutely no clue what’s asked, decide the question is not good enough.
  • Matt Parsons: Names can’t transmit meaning. They can transmit a pointer, though, which might point to some meaning. If that meaning isn’t the right meaning, then the recipient will misunderstand. Misunderstandings like this can be difficult to track down, because our brains don’t give us a type error with a line and column number to look at. Instead, we just feel confused, and we have to dig through our concept graph to figure out what’s missing or wrong.
  • Avast: The findings from the analysis of the obtained snapshot of the C&C server were quite surprising. All of the executable files on the server were infected with the Neshta fileinfector. The authors of Retadup accidentally infected themselves with another malware strain. This only proves a point that we have been trying to make – in good humor – for a long time: malware authors should use robust antivirus protection. 
  • @greenbirdIT: According to 1 study, computational resources required to train large AI models is doubling every three to four months. 
  • @ACLU: Amazon wants to connect doorbell cameras to facial recognition databases, with the ability to call the police if any “suspicious” people are detected.
  • @esh: Kira just discovered the joy of increasing the AWS Lambda MemorySize from the default of 128 to 1792, resulting in the use of a full CPU and a much faster response time. Her Slack command now answers in 2.5 seconds instead of 35 seconds. And the cool thing is that it costs the same to run it faster.
  • @johncutlefish: “We know something is working when we spend longer on it, instead of shorter. Her team was delivering into production daily/weekly. They could have easily bragged about how “quickly” they “move things to done”. But she didn’t”
  • tilolebo: Isn’t it possible to just set the haproxy maxconn to a slightly lower value than what the backend can deal with, and then let the reverse proxy retry once with another backend? Or even queue it for some hundreds of milliseconds before that? This way you avoid overloading backends. Also, haproxy provides tons of real-time metrics, including the high-water mark for concurrent and queued connections.
  • ignoramous: Scheduled tasks are a great way to brown-out your downstream dependencies. In one instance, mdadm RAID checks caused P99 latency spikes the first Sunday of every month [0] (default setting). It caused a lot of pain to our customers until the check was IO throttled, which meant spikes weren’t as high, but lasted for a longer time. Scheduled tasks are a great way to brown-out yourself.
  • @sfiscience: “The universe is not a menu. There’s no reason to think it’s full of planets just waiting for humans to turn up. For most of Earth’s history, it hasn’t been comfortable for humans.” – Olivia Judson

Useful Stuff:

  • Always on the lookout for examples from different stacks. Here’s a new power couple. Using Backblaze B2 and Cloudflare Workers for free image hosting. It looks pretty straightforward and even better, “Everything I’ve mentioned in the post is 100% free, assuming you stay within reasonable limits.” Backblaze has a 10GB free storage limit, and then charges $0.005/GB/Month thereafter. Cloudflare Workers also offers a free tier which includes 100,000 requests every 24 hours, with a maximum of 1,000 requests every 10 minutes. Also, Migrating 23TB from S3 to B2 in just 7 hours
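    The pattern is simple enough to sketch. A minimal, hypothetical Cloudflare Worker (module syntax, TypeScript, assuming the Workers runtime types are available) that fronts a public B2 bucket might look like this; the bucket URL and cache TTL are placeholders, not details from the post:

      // Rewrite the incoming path onto a (hypothetical) public B2 bucket URL and
      // let Cloudflare's edge cache absorb repeat requests.
      const B2_BUCKET_URL = "https://f002.backblazeb2.com/file/my-public-bucket"; // placeholder

      export default {
        async fetch(request: Request): Promise<Response> {
          const url = new URL(request.url);
          // e.g. GET /images/cat.jpg -> <bucket>/images/cat.jpg
          const origin = await fetch(B2_BUCKET_URL + url.pathname, {
            cf: { cacheEverything: true, cacheTtl: 86400 }, // Workers-specific cache hints
          });
          // Re-wrap the response so we can set our own caching headers for browsers.
          const response = new Response(origin.body, origin);
          response.headers.set("Cache-Control", "public, max-age=86400");
          return response;
        },
      };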
  • Anything C can do Rust can do better. Well, not quite yet. Intel’s Josh Triplett on what it would take for Rust to be comparable to C. Rust is 4 years old. Rust needs full parity with C to support the long tail of systems software. Rust has automatic memory management without GC; calls to free are inserted by the compiler at compile time. Like C, Rust does not have a runtime. Unlike C, Rust has safe concurrent programming; the memory safety makes it easier to implement safe concurrency. Rust would have prevented 73% of the security bugs at Mozilla. Rust needs better C interoperability. Rust needs to improve code size by not linking in unused code. It needs to support inline assembly, safe SIMD intrinsics, and bfloat16 to minimize storage space and bandwidth for floating point calcs.
  • Apparently we now need 2FA all the way down. Fraudsters Used AI to Mimic CEO’s Voice in Unusual Cybercrime Case: Criminals used artificial intelligence-based software to impersonate a chief executive’s voice and demand a fraudulent transfer of €220,000 ($243,000) in March in what cybercrime experts described as an unusual case of artificial intelligence being used in hacking.
  • What’s Mars Solar Conjunction, and Why Does It Matter? For 10 days we won’t talk to devices on Mars. Why? “because Mars and Earth will be on opposite sides of the Sun, a period known as Mars solar conjunction. The Sun expels hot, ionized gas from its corona, which extends far into space. During solar conjunction, this gas can interfere with radio signals when engineers try to communicate with spacecraft at Mars, corrupting commands and resulting in unexpected behavior from our deep space explorers. To be safe, engineers hold off on sending commands when Mars disappears far enough behind the Sun’s corona that there’s increased risk of radio interference.” This period when commands are not sent is called a “command moratorium.” Talk about a maintenance window! This is the kind of thing Delay-tolerant networking has to take into account. Machines will need enough native intelligence to survive without human guiding hands.
  • Ready for a new aaS? iPaaS is integration platform as a service: iPaaS lets you connect anything to anything, in any way, and anywhere. iPaaS works very well in huge enterprise environments that need to integrate a lot of on-premises applications and cloud-based applications or data providers.
  • Living life on the edge. Macquarie Bank replaced 60 EC2 instances with code running on Lambda@Edge, for lower latency and an 80% cost savings. At the edge, before a response goes back to the client, they inject a few headers: HSTS to require encryption, and X-Frame-Options to prevent pages from being loaded in an iframe, protecting against clickjacking. They also validate the JWT token and redirect to a login page if it’s invalid. WAF and Shield are also used for protection.
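    That header-injection step is easy to picture. Here’s a rough sketch (not Macquarie’s actual code) of a Lambda@Edge origin-response handler adding the headers mentioned above; the JWT validation would live in a separate viewer-request handler:

      export const handler = async (event: any) => {
        // Lambda@Edge hands you the CloudFront response; mutate its headers and return it.
        const response = event.Records[0].cf.response;
        const headers = response.headers;

        // HSTS: require HTTPS for a year, including subdomains.
        headers["strict-transport-security"] = [
          { key: "Strict-Transport-Security", value: "max-age=31536000; includeSubDomains" },
        ];
        // Refuse to be framed, which protects against clickjacking.
        headers["x-frame-options"] = [{ key: "X-Frame-Options", value: "DENY" }];

        return response;
      };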
  • I had no idea you could and should prune lambda versions. The Dark Side of AWS Lambda: Lambda versions every function. When you couple CI/CD with rapid development and Lambda functions, you get many versions. Hundreds even. And Lambda code storage is limited to 75GB. We hit that limit, and we hit it hard. AWS does allow you to delete specified versions of functions that are no longer in use. 
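    If you’re in the same spot, pruning can be scripted. A hedged sketch using the AWS SDK for JavaScript v3 (the retention count is an arbitrary choice, and pagination and alias checks are left out):

      import {
        LambdaClient,
        ListVersionsByFunctionCommand,
        DeleteFunctionCommand,
      } from "@aws-sdk/client-lambda";

      const lambda = new LambdaClient({});
      const KEEP = 5; // how many recent numbered versions to retain (arbitrary)

      export async function pruneVersions(functionName: string): Promise<void> {
        const { Versions = [] } = await lambda.send(
          new ListVersionsByFunctionCommand({ FunctionName: functionName })
        );
        const numbered = Versions
          .filter((v) => v.Version !== "$LATEST")
          .sort((a, b) => Number(a.Version) - Number(b.Version));
        // Delete everything except the newest KEEP versions. A real script should
        // also skip versions that an alias still points at.
        for (const v of numbered.slice(0, Math.max(0, numbered.length - KEEP))) {
          await lambda.send(
            new DeleteFunctionCommand({ FunctionName: functionName, Qualifier: v.Version })
          );
        }
      }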
  • Was Etsy too good to be true? Platforms follow a life cycle: 
    • Most platform users can’t make a living: Though he once dreamed of Etsy sellers making their livings selling things they made themselves, he knows now that was never really what happened for the vast majority. Even when he was CEO and things were small and maybe idyllic, only a fraction of a percentage of sellers were making more than $30,000 a year. 
    • The original platform niche is abandoned as the platform searches for more revenue by broadening its audience: “It’s just a place to sell now,” Topolski says, delineating her personal relationship with the platform that built her business and helped her find the community that makes up much of her world.
    • Platform costs are shifted to users: “I get it, Etsy as a whole needs to be competitive in a marketplace that’s completely shifted toward being convenient,” she tells me. “But it’s a financial issue for people like me whose products are extremely expensive to ship. All of a sudden my items are $10 to $15 more expensive, but I didn’t add any value to justify that pricing.” 
    • User margins become a source of platform profits: before Silverman took over, an Etsy executive told Forbes that more than 50 percent of Etsy’s revenue comes from seller services, like its proprietary payment processing system, which takes a fee of 3 percent, plus 25 cents per US transaction (the company made it the mandatory default option in May, removing the option for sellers to use individual PayPal accounts). New advertising options and customer support features in Etsy Plus — available to sellers willing to pay a $10 monthly fee — expand on that.
    • An edifice complex often signals the end: One moment that sticks out in her mind: a tour of Etsy’s new nine-story, 200,000-square-foot offices in Brooklyn’s Dumbo neighborhood, which opened in the spring of 2016. “I remember immediately getting this sinking feeling that none of it was for us,” she says. It didn’t seem like the type of place she could show up for a casual lunch. It was nice that the building was environmentally-friendly, that it was big and beautiful. It was weird that there was so much more security and less crafting, replaced by the sleek lines of a grown-up startup.
    • Once valuable platform users become just another metric/kpi: “We’re the heart of the company, creating literally all content and revenue,” she says, “and suddenly we weren’t particularly welcome anymore.”
  • koreth: I have been working on an ES/CQRS system for about 4 years and enjoy it…it’s a payment-processing service. 
    • What have the costs been? The message-based model is kind of viral. Ramping up new engineers takes longer, because many of them also have never seen this kind of system before. Debugging gets harder because you can no longer look at a simple stack trace.  I’ve had to explain why I’m spending time hacking on the framework rather than working on the business logic.
    • What have the benefits been? The fact that the inputs and outputs are constrained makes it phenomenally easier to write meaningful, non-brittle black-box unit tests. Having the ability to replay the event log makes it easy to construct new views for efficient querying. Debugging gets easier because you have an audit trail of events and you can often suck the relevant events into a development environment and replay them. Almost nothing had to change in the application code when we went from a single-node-with-hot-standby configuration to a multiple-active-nodes configuration. The audit trail is the source of truth, not a tacked-on extra thing that might be wrong or incomplete. 
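      The replay idea is the heart of it. A minimal, generic sketch (not koreth’s system; the event types are invented) of an append-only log as the source of truth, with read views rebuilt by folding over it:

        type Event =
          | { type: "PaymentRequested"; id: string; amountCents: number }
          | { type: "PaymentCaptured"; id: string }
          | { type: "PaymentFailed"; id: string; reason: string };

        const eventLog: Event[] = []; // in a real system: Kafka, a database table, etc.

        function append(event: Event): void {
          eventLog.push(event); // the audit trail and the source of truth are the same thing
        }

        // A new read model can be constructed at any time by replaying the log from the start.
        function buildStatusView(log: readonly Event[]): Map<string, string> {
          const view = new Map<string, string>();
          for (const e of log) {
            if (e.type === "PaymentRequested") view.set(e.id, "pending");
            if (e.type === "PaymentCaptured") view.set(e.id, "captured");
            if (e.type === "PaymentFailed") view.set(e.id, `failed: ${e.reason}`);
          }
          return view;
        }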
  • How long before SSDs replace HDDs? DSHR says a lot longer than you might think. The purchase cost of an HDD is much more than 20% of the power and cooling costs over its service life. So speed isn’t as important as low $/TB. Speed in nearline is nice, but it isn’t what the nearline tier is for. At 5x the price of HDDs, SSDs won’t justify wholesale replacement of the nearline tier. The recent drop in SSD price reflects the transition to 3D flash. The transition to 4D flash is far from imminent, so this is a one-time effect.
  • As soon as you have the concept of a transaction — a group of read and write operations — you need to have rules for what happens during the timeline between the first of the operations of the group and the last of the operations of the group. An explanation of the difference between Isolation levels vs. Consistency levels: Database isolation refers to the ability of a database to allow a transaction to execute as if there are no other concurrently running transactions (even though in reality there can be a large number of concurrently running transactions). The overarching goal is to prevent reads and writes of temporary, incomplete, aborted, or otherwise incorrect data written by concurrent transactions. Database consistency is defined in different ways depending on the context, but when a modern system offers multiple consistency levels, they define consistency in terms of the client view of the database. If two clients can see different states at the same point in time, we say that their view of the database is inconsistent (or, more euphemistically, operating at a “reduced consistency level”). Even if they see the same state, but that state does not reflect writes that are known to have committed previously, their view is inconsistent with the known state of the database. 
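    To make the isolation half concrete, here is a small sketch assuming PostgreSQL and the node-postgres (“pg”) client, with a hypothetical accounts table. SERIALIZABLE asks the database to behave as if the transaction ran alone; a conflicting concurrent transaction gets aborted and must be retried:

      import { Client } from "pg";

      async function debitAlice(client: Client): Promise<void> {
        await client.query("BEGIN ISOLATION LEVEL SERIALIZABLE");
        try {
          const { rows } = await client.query(
            "SELECT balance FROM accounts WHERE id = $1", ["alice"]
          );
          await client.query(
            "UPDATE accounts SET balance = $1 WHERE id = $2",
            [rows[0].balance - 100, "alice"]
          );
          await client.query("COMMIT");
        } catch (err) {
          await client.query("ROLLBACK"); // e.g. a 40001 serialization_failure; retry the transaction
          throw err;
        }
      }

    Consistency, by contrast, is about which committed state a client is allowed to see: a perfectly isolated transaction run against a lagging read replica can still observe stale data.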
  • How We Manage a Million Push Notifications an Hour. Key idea: Each time we found a point which needed to handle multiple implementations of the same core logic, we put it behind a dedicated service: Multiple devices for a user was put behind the token service. Multiple applications were given a common interface on notification server. Multiple providers were handled by individual job queues and notification workers. Also, Rust at OneSignal
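    The shape of that idea is easy to show. A generic sketch (the provider names and payload shape are illustrative, not OneSignal’s code) of putting every provider behind one interface so the worker only sees the common core:

      interface PushProvider {
        send(deviceToken: string, payload: { title: string; body: string }): Promise<void>;
      }

      class ApnsProvider implements PushProvider {
        async send(deviceToken: string, payload: { title: string; body: string }) {
          /* talk to Apple's APNs here */
        }
      }

      class FcmProvider implements PushProvider {
        async send(deviceToken: string, payload: { title: string; body: string }) {
          /* talk to Firebase Cloud Messaging here */
        }
      }

      // The notification worker only sees the common interface; each provider can sit
      // behind its own job queue so a slow provider doesn't back up the others.
      const providers: Record<string, PushProvider> = {
        apns: new ApnsProvider(),
        fcm: new FcmProvider(),
      };

      async function deliver(provider: string, token: string, title: string, body: string) {
        await providers[provider].send(token, { title, body });
      }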
  • Having attended more than a few Hadoop meetups, this was like reading that a young friend was moving into a retirement home. What happened to Hadoop.
    • Something happened within the big data world to erode Hadoop’s foundation of a distributed file system (HDFS) coupled with a compute engine for running MapReduce (the original Hadoop programming model) jobs: 
      • Mobile phones became smartphones and began generating streams of real-time data.
      • Companies were reminded that they had already invested untold billions in relational database and data warehouse technologies.
      • Competitive or, at least, alternative projects such as Apache Spark began to spring up from companies, universities, and web companies trying to push Hadoop, and the whole idea of big data, beyond its early limitations.
      • Venture capital flowed into big data startups. 
      • Open source, now very much in the mainstream of enterprise IT, was getting better.
      • Cloud computing took over the world, making it easier not just to virtually provision servers, but also to store data cheaply and to use managed services that tackle specific use cases.
      • Docker and Kubernetes were born. Together, they opened people’s eyes to a new way of packaging and managing applications and infrastructure.
      • Microservices became the de facto architecture for modern applications.
    • What are the new trends?
      • Streaming data and event-driven architectures are rising in popularity. 
      • Apache Kafka is becoming the nervous system for more data architectures.
      • Cloud computing dominates infrastructure, storage, and data-analysis and AI services.
      • Relational databases — including data warehouses — are not going anywhere.
      • Kubernetes is becoming the default orchestration layer for everything.
  • 6 Lessons we learned when debugging a scaling problem on GitLab.com: But the biggest lesson is that when large numbers of people schedule jobs at round numbers on the clock, it leads to really interesting scaling problems for centralized service providers like GitLab. If you’re one of them, you might like to consider putting in a random sleep of maybe 30 seconds at the start, or pick a random time during the hour and put in the random sleep, just to be polite and fight the tyranny of the clock.
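    The fix is about as small as fixes get. A tiny sketch of the suggested jitter, for anyone writing such a scheduled client:

      const MAX_JITTER_MS = 30_000; // up to 30 seconds, as the post suggests

      function sleep(ms: number): Promise<void> {
        return new Promise((resolve) => setTimeout(resolve, ms));
      }

      export async function runScheduledJob(job: () => Promise<void>): Promise<void> {
        await sleep(Math.random() * MAX_JITTER_MS); // spread the thundering herd across the window
        await job();
      }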
  • Federated GraphQL Server at Scale: Zillow Rental Manager Real-time Chat Application: We share how we try to achieve developer productivity and synergy between different teams by having a federated GraphQL server…we decided to go with a full-fledged GraphQL, Node, React-Typescript application which would form the frontend part of the satellite…Both Rental Manager and Renter Hub talk to the Satellite GraphQL server (express-graphql server), which maps requests to the appropriate endpoint in the Satellite API after passing through the authentication service for each module…We implemented a layered approach where each module houses multiple features and each feature has its own schema, resolvers, tests, and services. This strategy allows us to isolate each feature into its own folder and then stitch everything together at the root of our server. Each feature has its own schema and is written in a file with a .graphql extension so that we can leverage all the developer tooling around GraphQL.
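    A loose sketch of that layered, per-feature layout, using express-graphql and @graphql-tools/schema (the chat feature and its fields are invented for illustration, not Zillow’s schema):

      import express from "express";
      import { graphqlHTTP } from "express-graphql";
      import { makeExecutableSchema } from "@graphql-tools/schema";

      // Each feature folder contributes its own SDL and resolvers...
      const chatTypeDefs = `
        type Message {
          id: ID!
          text: String!
        }
        extend type Query {
          messages(conversationId: ID!): [Message!]!
        }
      `;
      const chatResolvers = {
        Query: {
          messages: (_: unknown, args: { conversationId: string }) => [],
        },
      };

      // ...and the server root stitches everything together.
      const schema = makeExecutableSchema({
        typeDefs: [`type Query { _empty: String }`, chatTypeDefs],
        resolvers: [chatResolvers],
      });

      const app = express();
      app.use("/graphql", graphqlHTTP({ schema, graphiql: true }));
      app.listen(4000);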

Soft Stuff:

  • cloudstateio/cloudstate: a standards effort defining a specification, protocol, and reference implementation, aiming to extend the promise of Serverless and its Developer Experience to general-purpose application development. CloudState builds on and extends the traditional stateless FaaS model by adding support for long-lived addressable stateful services and a way of accessing mapped well-formed data via gRPC, while allowing for a range of different consistency models (from strong to eventual consistency) based on the nature of the data and how it should be processed, managed, and stored.
  • TULIPP (article): makes it possible to develop energy-efficient embedded image processing systems more quickly and less expensively, with a drastic reduction in time-to-market. The results are impressive: the processing, which originally took several seconds to analyze a single image on a high-end PC, can now run on the drone in real time, i.e. approximately 30 images are analyzed per second. “The speed of the pedestrian detection algorithm could be increased by a factor of 100: now the system can analyze 14 images per second compared to one image every seven seconds. Enhancement of X-ray image quality by applying noise-removing image filters allowed reducing the intensity of radiation during surgical operations to one fourth of the previous level. At the same time energy consumption could be significantly reduced for all three applications.”

Pub Stuff:

  • A link layer protocol for quantum networks: Here, we take the first step from a physics experiment to a quantum internet system. We propose a functional allocation of a quantum network stack, and construct the first physical and link layer protocols that turn ad-hoc physics experiments producing heralded entanglement between quantum processors into a well-defined and robust service. This lays the groundwork for designing and implementing scalable control and application protocols in platform-independent software.
  • How Chemistry Computes: Language Recognition by Non-Biochemical Chemical Automata. From Finite Automata to Turing Machines: Our Turing machine uses the Belousov-Zhabotinsky chemical reaction and checks the same symbol in an Avogadro’s number of processors. Our findings have implications for chemical and general computing, artificial intelligence, bioengineering, the study of the origin and presence of life on other planets, and for artificial biology.
  • Choosing a cloud DBMS: architectures and tradeoffs: My key takeaways as a TL;DR: Store your data in S3; Use portable data format that gives you future flexibility to process it with multiple different systems (e.g. ORC or Parquet); Use Athena for workloads it can support (Athena could not run 4 of the 22 TPC-H queries, and Spectrum could not run 2 of them), especially if you are doing less frequent ad-hoc queries.
  • The Art Of PostgreSQL: is the new edition of my previous release, Mastering PostgreSQL in Application Development. It contains mostly fixes to the old content, a new title, and a new book design (PDF and paperback). Content wise, The Art of PostgreSQL also comes with a new whole chapter about PostgreSQL Extensions.
  • TeaVaR: Striking the Right Utilization-Availability Balance in WAN Traffic Engineering: We advocate a novel approach to this challenge that draws inspiration from financial risk theory: leverage empirical data to generate a probabilistic model of network failures and maximize bandwidth allocation to network users subject to an operator-specified availability target. Our approach enables network operators to strike the utilization-availability balance that best suits their goals and operational reality. We present TeaVaR (Traffic Engineering Applying Value at Risk), a system that realizes this risk management approach to traffic engineering (TE).

from High Scalability