h2. Don MacAskill (SmugMug) -- Set Amazon's Servers on Fire, Not Yours
p. SmugMug: 140MM photos, no debt, profitable since its first year. 192TB stored on S3, doubling yearly.
* S3::Simple Storage Service. $0.15/GB/month with replication. REST API. Fast, not 15K-SCSI fast, but internet fast.
h3. Why use them?
* not a lot of web scale expertise on planet earth
* reputation for systems
* [he] once competed with Amazon - Fatbrain
* They eat their own dogfood. Dozens of products.
* Focus on the app, not the muck.
h3. Show me the money!
* Guesstimate: ~$500k saved per year
** Growth: 64MM photos -> 140MM photos
** Disks would cost $40k -> $100k/month
** $922K would have been spent
** $230K spent instead
** $692K in cold, hard savings
* Nasty taxes (on capital goods)! $295K 'saved' in cash flow. Bonus!
* Reselling disks to recoup sunk costs
* this is a partial cost of ownership number
h3. sweet spots
* perfect for startups & small companies
* ideal for _store lots, serve little_ businesses of all sizes
* not so great (yet) for serving lots if you're a medium or large sized business. Transfer costs are high if you can buy bandwidth in 1Gbps+ chunks yourself.
* We're a _store lots, serve lots_ company. What to do?
h3. Like SmugFS
* Architecture remarkably similar to internal Smug filesystem
* Similar to lots of startups
* Stupid we're all building the same thing
* Easy to drop in
* Started on Monday, live in production on Friday
h3. S3 evolution
* started with S3 as just secondary storage. Too cold!
* Tried out as Primary. Too hot!
* Finally hot & cold model == just right!
* Amazon gets 100% of the data
* Smugmug keeps _hot_ data local (about 10%)
* 95% reduction in # of disks bought
h3. Sample Request
* They check the local cache for the image; on a miss they log it, retrieve the image from S3, and return it to the client.
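That request flow can be sketched as follows (a minimal sketch; the dict cache and `fetch_from_s3` are stand-ins for SmugMug's real hot-data store and S3 client):

```python
import logging

log = logging.getLogger("image-serve")

# Stand-ins for SmugMug's real hot-data cache and S3 client (assumptions).
local_cache = {}

def fetch_from_s3(key):
    # Placeholder for an authenticated S3 GET.
    return b"image-bytes-for-" + key.encode()

def serve_image(key):
    """Serve from the local hot cache; on a miss, log it and pull from S3."""
    if key in local_cache:
        return local_cache[key]
    log.info("cache miss: %s", key)   # log the miss, per the notes
    data = fetch_from_s3(key)         # retrieve from S3
    local_cache[key] = data           # keep hot data local (~10%)
    return data
```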
h3. Proxy vs Redirect vs Direct Links
* Build SmugMug->S3 with multiple mods
* Can flip a switch to change
* Nearly 100% served are proxy reads
* Sometimes HTTP redirects
* Rarely direct S3 links
* SmugMug has complicated permissions
* Passwords, privacy, external links
* Proxying allows strong protection
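The switchable serving modes might look like this (the mode names and callable signature are assumptions, not SmugMug's actual code):

```python
# Hypothetical serving modes, switchable at runtime with a config flip.
MODE = "proxy"  # or "redirect" or "direct"

def build_response(s3_url, fetch_and_stream, mode=MODE):
    if mode == "proxy":
        # Proxy read: SmugMug fetches from S3 and streams the bytes itself,
        # so passwords and privacy settings are enforced on every request.
        return ("200", fetch_and_stream(s3_url))
    if mode == "redirect":
        # HTTP redirect: the client fetches this one URL from S3 directly.
        return ("302", s3_url)
    # Direct S3 link embedded in the page; weakest access control.
    return ("200", s3_url)
```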
h3. REST vs SOAP
* loves rest, hates SOAP
* Nothing useful added with SOAP's complexity
h3. Reliability
* not 100% reliable, but close
* more reliable than SmugFS
* no service level agreement
* Lots of failure points:
** SmugMug's datacenter
** Internet backbones
** Amazon's datacenter
* No other software, hardware, or service [they] use is 100% reliable either
h3. Handling failure
* Built from day one with failure in mind.
* Stuff breaks: try again
* Writes fail? Write locally, sync later.
* Reads fail? Handle intelligently. Alerts?
* Fast for reads and writes
* Mostly speed-of-light limited: 20-80ms
* Parallel I/O for massive throughput: 100s of Mbps
* Machine measurable, human indistinguishable
* S3 is not a CDN[content delivery network]
* it's storage
* no global locations yet
* limited edge caching
* perhaps a future Amazon web service?
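The failure-handling approach above (retry, then write locally and sync later) might be sketched like this; the storage callables are stand-ins, not a real S3 client:

```python
import time

def put_with_fallback(key, data, s3_put, local_put, retries=3, delay=0.0):
    """Try S3 a few times; if writes still fail, write locally and sync later."""
    for attempt in range(retries):
        try:
            s3_put(key, data)                   # stuff breaks: try again
            return "s3"
        except IOError:
            time.sleep(delay * (2 ** attempt))  # simple backoff between tries
    local_put(key, data)                        # queue for a later background sync
    return "local"
```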
h3. How do they do their proxy reads?
p. Store and forward vs stream
* Store and forward
** great resiliency
** poor performance
** if it's a big file, really poor performance
* Stream
** poor resiliency
** great performance
** do a quick HEAD first to verify
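A streaming proxy read with the quick HEAD check might look like this sketch (the `head`, `get_chunks`, and `send` callables are stand-ins for real HTTP plumbing):

```python
def proxy_read(key, head, get_chunks, send):
    """Stream the object to the client: a cheap HEAD first to verify it
    exists, then pipe chunks through without buffering the whole file."""
    status = head(key)                # quick HEAD to verify before streaming
    if status != 200:
        return False
    for chunk in get_chunks(key):     # great performance, but no do-over
        send(chunk)                   # once bytes are already on the wire
    return True
```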
h3. The speed of light problem
* he was misquoted as saying Amazon was slow when he was actually explaining the speed of light
* Amazon has not solved faster-than-light data transmission. Yet.
* unavoidable, make sure your application can tolerate
* parallelized I/O can mask problem
* caching can help
* streaming can help
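Masking per-request latency with parallel I/O can be as simple as a thread pool (a sketch; `fetch_one` stands in for a single S3 GET):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_fetch(keys, fetch_one, workers=16):
    """Fetch many objects concurrently; each request still pays its
    20-80ms of speed-of-light latency, but the waits overlap, so
    aggregate throughput stays high."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_one, keys))
```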
h3. Outages and Problems
* not perfect, five major issues
* 3 outages of 15-30 minutes: 2 were core switch failures, one was a DNS problem. Amazon.com was affected too.
* 2 performance degradations: one a SmugMug customer noticed, the other wasn't noticed
* Not a big deal, everything fails, expect it.
h3. SLA, Service and Support
* SmugMug doesn't care about an SLA, but others might
* Service Support: One area where Amazon is weak.
** This is a utility
** They need a service status dashboard
** Pro-active customer notifications
** Ability to get a hold of a human
* Support for developers is quite good.
* Amazon.com's customer service is good, AWS will likely catch up
h3. Saving SmugMug's butts
* knocked out power to ~70TB of storage. Oops!
* Moved datacenters during normal business hours, customers not affected
* Stupid bugs
h3. Miscellaneous Tips
* use cURL
** more reliable
** storing vs streaming is simple
* make stuff as asynchronous as possible
** hides speed of light issues
** hides or masks problems
** fast customer service
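Pushing work onto a background queue is one way to get that asynchrony (a sketch, not SmugMug's code):

```python
import queue
import threading

def start_async_worker(handle):
    """Run jobs on a background thread so user-facing requests return
    immediately; slow or flaky S3 calls never block the customer."""
    jobs = queue.Queue()

    def worker():
        while True:
            item = jobs.get()
            if item is None:       # sentinel: shut the worker down
                break
            handle(item)           # retries/alerting could live here

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return jobs, t
```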
h3. Elastic Compute Cloud (EC2)
* Like S3 but for computing
** scale up or down via API
** web servers, processing boxes, development test beds, etc.
* Launching large EC2 implementation "soon"
** image processing
** 500k-1M photos/day
** 10-20 terapixels/day processed
** peaky traffic on weekends, holidays
** ridiculously parallel
h3. Simple Queue Service (SQS)
* Simple, reliable queueing
* Mates well with EC2 and S3
** Stick jobs in SQS
** retrieve jobs with EC2 instances using S3 data
** run jobs, report status to SQS
* $0.10/1000 items
** Priced well for small projects
** gets costly for large ones (millions)
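That EC2/SQS/S3 pipeline might be sketched as a worker loop (all four callables are hypothetical stand-ins, not the real AWS client APIs):

```python
def worker_loop(receive_job, download, process, report_status):
    """One EC2 worker: pull jobs from SQS, fetch the photo from S3,
    process it, and post the status back to SQS."""
    while True:
        job = receive_job()                    # the job was stuck in SQS
        if job is None:
            break                              # queue drained
        photo = download(job["key"])           # the data lives in S3
        report_status(job["id"], process(photo))
```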
h3. Missing Pieces
* Database API or DB grade EC2 instances
** Fast (lots of local spindles, lots of RAM)
* Load Balancer API
** Single IP in front of lots of EC2 instances
** Programmable to add/remove/change clusters
** Can be done with software on an EC2 instance, but painful
Slides to be at http://blogs.smugmug.com/
* How are they using EC2?
** The EC2 instances invoke SmugMug APIs to do work. The SmugMug servers don't really know much about EC2
* I asked:
Has their use of Amazon been an issue, either to outside investors or customers?
** Not an issue, they have no outside investors, and further they've talked with VCs to raise the issue that startups _should_ be looking at Amazon's services (and if not, why not)
h2. Superninja Privacy Techniques for Web Application Developers
Marc Hedlund and Brad ...? from Wesabe. Wesabe is a personal finance web application.
- Keep critical data local. If there's data you'd never ever ever want to lose, don't put it on a web site.
Use a privacy wall to separate public and private data
- created a wesabe uploader for Mac/Windows to keep bank credentials on your computer. The uploader downloads data from bank sites, strips certain data out of the files, then uploads to Wesabe.
- don't trust the site. sensitive data filtered before it ever reaches the server.
- requires a download
- puts burden on user to maintain a secure machine (same risk as using a web browser to bank)
- if successful, risk of trojan targeting
- use secret key as index in db
- secret key is only computed when user is logged in (they use hash(password + salt))
- secret key stored in session data
- other paths through the db: if you're using a privacy wall, you must ensure all transactions traverse it
- the data itself can leak information
- logs and exception reports can capture leaked information
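The privacy-wall key derivation they describe, hash(password + salt), might be sketched like this (sha256 is used here for brevity; a real deployment would favor a slow KDF such as PBKDF2):

```python
import hashlib

def secret_key(password, salt):
    """Derived only at login time (hash(password + salt)) and held in
    session data; it is never stored, so the server alone can't walk
    the private side of the database."""
    return hashlib.sha256((password + salt).encode()).hexdigest()

# Private rows are indexed by the derived key, not by user id, so a raw
# DB dump doesn't link accounts to their sensitive records.
private_rows = {secret_key("hunter2", "per-user-salt"): {"balance": 1234}}
```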
Use partitioning to protect against breaches
- password changing and recovery become trickier
- use a _locker_: generate a one-time key for the user, stored in the locker
- encrypt using the locker key rather than the password
- troubleshooting can be harder
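The locker idea could be sketched as follows. Loud caveat: the XOR keystream below is a toy stand-in for a real authenticated cipher (e.g. AES-GCM); only the structure, a random one-time key wrapped by a password-derived key, is the point:

```python
import hashlib
import secrets

def _xor_stream(key, data):
    # Toy XOR keystream for illustration only; use a real AEAD cipher.
    stream, ctr = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def make_locker(password, salt):
    """Generate a one-time data key and store it wrapped by a
    password-derived key.  User data is encrypted with the locker key,
    so a password change only rewraps the tiny locker, not all the data."""
    locker_key = secrets.token_bytes(32)
    pw_key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return _xor_stream(pw_key, locker_key), locker_key

def open_locker(password, salt, wrapped):
    pw_key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return _xor_stream(pw_key, wrapped)
```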
data fuzzing and log scrubbing
- keep pools of sensitive data separate
- eg membership and financial records kept separate
- no relationship between them other than status
- reduces the impact of any breach; firewalls off anything truly identifiable
- allows separate policies and approaches by data type
- pretty much zero drawbacks other than implementation time
- (currently) no requirement to retain specific data on users of a service (in the US)
- Subpoena / warrant may require that you give up all data on a user
- Different countries have different data retention policies (see epic.org)
- filter key parameters from logs
- remove some of the precision of IP addresses
- remove precision from timestamps since they too can be used to identify someone (cf. example of whistleblower information)
- prevents leakage of passwords
- avoids giving attackers / law enforcement a way through the privacy wall
- loss of certain private data may require you to notify your customers
- best protection is to delete your logs
- important to have a public policy in place (cf link to eff.org policy information)
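A log scrubber covering those tips (filter key parameters, coarsen IPs, round timestamps) might look like this sketch; the patterns are illustrative, not exhaustive:

```python
import re

def scrub(line):
    """Scrub one log line: filter password-style parameters, zero the
    IP's last octet, and round timestamps down to the hour."""
    line = re.sub(r"password=[^&\s]+", "password=[FILTERED]", line)
    line = re.sub(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b", r"\1.0", line)
    line = re.sub(r"\b(\d{2}):\d{2}:\d{2}\b", r"\1:00:00", line)
    return line
```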
use voting algorithms to determine public information
- no protection against wiretap orders
- difficult to cover all your bases (use centralized logging)
- "the esp game" to tag things at CMU.edu. If two people tag something the same thing at the same time, maybe that's a good tag to apply.
- look at google image labeller
- when people agree on a term, it's common knowledge
- if enough people agree, it's probably publicly known
- private transactions shouldn't be shown on the site
- lots of users naming a merchant probably means it's public
- works on opaque information
- reliable -- very few faults since launch
- no manual work needed
- information is hidden until threshold met (understates available info)
- can leak data if the threshold is too low
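The threshold scheme can be sketched in a few lines (the threshold value here is an arbitrary choice for illustration):

```python
from collections import Counter

def public_name(votes, threshold=5):
    """Reveal a merchant name only when enough independent users agree;
    below the threshold the name stays hidden, deliberately understating
    what the system knows rather than leaking a private transaction."""
    if not votes:
        return None
    name, count = Counter(votes).most_common(1)[0]
    return name if count >= threshold else None
```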
- hash your passwords; don't store them in plaintext
- random (non-sequential) database ids. Don't use auto-inc ids in public data.
- data bill of rights -- your data is your data. can export, delete, etc.
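Those last tips (salted password hashes, non-sequential public ids) might be sketched as:

```python
import hashlib
import os
import secrets

def store_password(password):
    """Never store plaintext: keep a random salt plus a slow salted hash."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password, salt, digest):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return secrets.compare_digest(candidate, digest)

def new_public_id():
    # Random, non-sequential id for public data; auto-increment ids leak
    # record counts and make enumerating other users' data trivial.
    return secrets.token_hex(8)
```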
Posted in ETech