h2. Don MacAskill (SmugMug) -- Set Amazon's Servers on Fire, Not Yours
p. SmugMug: 140MM photos, no debt, profitable since its first year. 192TB stored on S3, doubling yearly.
* S3::Simple Storage Service. $0.15/GB/month with replication. REST API. Fast, not 15K-SCSI fast, but internet fast.
h3. Why use them?
* not a lot of web scale expertise on planet earth
* reputation for systems
* [he] once competed with Amazon - Fatbrain
* They eat their own dogfood. Dozens of products.
* Focus on the app, not the muck.
h3. Show me the money!
* Guesstimate: ~$500k saved per year
** Growth: 64MM photos -> 140MM photos
** Disks would cost $40k -> $100k/month
** $922K would have been spent
** $230K spent instead
** $692K in cold, hard savings
* Nasty taxes (on capital goods)! $295K 'saved' in cash flow. Bonus!
* Reselling disks to recoup sunk costs
* this is a partial cost of ownership number
h3. sweet spots
* perfect for startups & small companies
* ideal for _store lots, serve little_ businesses of all sizes
* not so great (yet) for serving lots if you're a medium or large sized business. Transfer costs are high if you can buy bandwidth in 1Gbps+ chunks yourself.
* We're a _store lots, serve lots_ company. What to do?
h3. Like SmugFS
* Architecture remarkably similar to internal Smug filesystem
* Similar to lots of startups
* Stupid we're all building the same thing
* Easy to drop in
* Started on Monday, live in production on Friday
h3. S3 evolution
* started with S3 as just secondary storage. Too cold!
* Tried out as Primary. Too hot!
* Finally hot & cold model == just right!
* Amazon gets 100% of the data
* Smugmug keeps _hot_ data local (about 10%)
* 95% reduction in # of disks bought
h3. Sample Request
* They check the local cache for the image; on a miss they log it, retrieve the image from S3, and return it to the client.
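That request flow can be sketched as follows (a minimal sketch; the dict cache and `fetch_from_s3` are stand-ins for SmugMug's real hot-data store and S3 client):

```python
import logging

log = logging.getLogger("image-serve")

# Stand-ins for SmugMug's real hot-data cache and S3 client (assumptions).
local_cache = {}

def fetch_from_s3(key):
    # Placeholder for an authenticated S3 GET.
    return b"image-bytes-for-" + key.encode()

def serve_image(key):
    """Serve from the local hot cache; on a miss, log it and pull from S3."""
    if key in local_cache:
        return local_cache[key]
    log.info("cache miss: %s", key)   # log the miss, per the notes
    data = fetch_from_s3(key)         # retrieve from S3
    local_cache[key] = data           # keep hot data local (~10%)
    return data
```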
h3. Proxy vs Redirect vs Direct Links
* Build SmugMug->S3 with multiple mods
* Can flip a switch to change
* Nearly 100% served are proxy reads
* Sometimes HTTP redirects
* Rarely direct S3 links
* SmugMug has complicated permissions
* Passwords, privacy, external links
* Proxying allows strong protection
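The switchable serving modes might look like this (the mode names and callable signature are assumptions, not SmugMug's actual code):

```python
# Hypothetical serving modes, switchable at runtime with a config flip.
MODE = "proxy"  # or "redirect" or "direct"

def build_response(s3_url, fetch_and_stream, mode=MODE):
    if mode == "proxy":
        # Proxy read: SmugMug fetches from S3 and streams the bytes itself,
        # so passwords and privacy settings are enforced on every request.
        return ("200", fetch_and_stream(s3_url))
    if mode == "redirect":
        # HTTP redirect: the client fetches this one URL from S3 directly.
        return ("302", s3_url)
    # Direct S3 link embedded in the page; weakest access control.
    return ("200", s3_url)
```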
h3. REST vs SOAP
* loves rest, hates SOAP
* Nothing useful added with SOAP's complexity
h3. Reliability
* not 100% reliable, but close
* more reliable than SmugFS
* no service level agreement
* Lots of failure points:
** SmugMug's datacenter
** Internet backbones
** Amazon's datacenter
* No other software, hardware, or service [they] use is 100% reliable either
h3. Handling failure
* Built from day one with failure in mind.
* Stuff breaks: try again
* Writes fail? Write locally, sync later.
* Reads fail? Handle intelligently. Alerts?
* Fast for reads and writes
* Mostly speed-of-light limited: 20-80ms
* Parallel I/O for massive throughput: 100s of Mbps
* Machine measurable, human indistinguishable
* S3 is not a CDN[content delivery network]
* it's storage
* no global locations yet
* limited edge caching
* perhaps a future Amazon web service?
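The failure-handling approach above (retry, then write locally and sync later) might be sketched like this; the storage callables are stand-ins, not a real S3 client:

```python
import time

def put_with_fallback(key, data, s3_put, local_put, retries=3, delay=0.0):
    """Try S3 a few times; if writes still fail, write locally and sync later."""
    for attempt in range(retries):
        try:
            s3_put(key, data)                   # stuff breaks: try again
            return "s3"
        except IOError:
            time.sleep(delay * (2 ** attempt))  # simple backoff between tries
    local_put(key, data)                        # queue for a later background sync
    return "local"
```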
h3. How do they do their proxy reads?
p. Store and forward vs stream
* Store and forward
** great resiliency
** poor performance
** if it's a big file, really poor performance
* Stream
** poor resiliency
** great performance
** do a quick HEAD first to verify
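A streaming proxy read with the quick HEAD check might look like this sketch (the `head`, `get_chunks`, and `send` callables are stand-ins for real HTTP plumbing):

```python
def proxy_read(key, head, get_chunks, send):
    """Stream the object to the client: a cheap HEAD first to verify it
    exists, then pipe chunks through without buffering the whole file."""
    status = head(key)                # quick HEAD to verify before streaming
    if status != 200:
        return False
    for chunk in get_chunks(key):     # great performance, but no do-over
        send(chunk)                   # once bytes are already on the wire
    return True
```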
h3. The speed of light problem
* he was misquoted as saying Amazon was slow when he was actually explaining the speed of light
* Amazon has not solved faster-than-light data transmission. Yet.
* unavoidable, make sure your application can tolerate
* parallelized I/O can mask problem
* caching can help
* streaming can help
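Masking per-request latency with parallel I/O can be as simple as a thread pool (a sketch; `fetch_one` stands in for a single S3 GET):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_fetch(keys, fetch_one, workers=16):
    """Fetch many objects concurrently; each request still pays its
    20-80ms of speed-of-light latency, but the waits overlap, so
    aggregate throughput stays high."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_one, keys))
```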
h3. Outages and Problems
* not perfect, five major issues
* 3 outages of 15-30 minutes: 2 were core switch failures, one was a DNS problem. Amazon.com was affected too.
* 2 performance degradations: one a SmugMug customer noticed, the other wasn't noticed
* Not a big deal, everything fails, expect it.
h3. SLA, Service and Support
* SmugMug doesn't care about an SLA, but others might
* Service Support: One area where Amazon is weak.
** This is a utility
** They need a service status dashboard
** Pro-active customer notifications
** Ability to get a hold of a human
* Support for developers is quite good.
* Amazon.com's customer service is good, AWS will likely catch up
h3. Saving SmugMug's butts
* knocked out power to ~70TB of storage. Oops!
* Moved datacenters during normal business hours, customers not affected
* Stupid bugs
h3. Miscellaneous Tips
* use cURL
** more reliable
** storing vs streaming is simple
* make stuff as asynchronous as possible
** hides speed of light issues
** hides or masks problems
** fast customer service
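Pushing work onto a background queue is one way to get that asynchrony (a sketch, not SmugMug's code):

```python
import queue
import threading

def start_async_worker(handle):
    """Run jobs on a background thread so user-facing requests return
    immediately; slow or flaky S3 calls never block the customer."""
    jobs = queue.Queue()

    def worker():
        while True:
            item = jobs.get()
            if item is None:       # sentinel: shut the worker down
                break
            handle(item)           # retries/alerting could live here

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return jobs, t
```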
h3. Elastic Compute Cloud (EC2)
* Like S3 but for computing
** scale up or down via API
** web servers, processing boxes, development test beds, etc.
* Launching large EC2 implementation "soon"
** image processing
** 500k-1M photos/day
** 10-20 terapixels/day processed
** peaky traffic on weekends, holidays
** ridiculously parallel
h3. Simple Queue Service (SQS)
* Simple, reliable queueing
* Mates well with EC2 and S3
** Stick jobs in SQS
** retrieve jobs with EC2 instances using S3 data
** run jobs, report status to SQS
* $0.10/1000 items
** Priced well for small projects
** gets costly for large ones (millions)
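That EC2/SQS/S3 pipeline might be sketched as a worker loop (all four callables are hypothetical stand-ins, not the real AWS client APIs):

```python
def worker_loop(receive_job, download, process, report_status):
    """One EC2 worker: pull jobs from SQS, fetch the photo from S3,
    process it, and post the status back to SQS."""
    while True:
        job = receive_job()                    # the job was stuck in SQS
        if job is None:
            break                              # queue drained
        photo = download(job["key"])           # the data lives in S3
        report_status(job["id"], process(photo))
```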
h3. Missing Pieces
* Database API or DB grade EC2 instances
** Fast (lots of local spindles, lots of RAM)
* Load Balancer API
** Single IP in front of lots of EC2 instances
** Programmable to add/remove/change clusters
** Can be done with software on an EC2 instance, but painful
Slides to be at http://blogs.smugmug.com/
* How are they using EC2?
** The EC2 instances invoke SmugMug APIs to do work. The SmugMug servers don't really know much about EC2
* I asked:
Has their use of Amazon been an issue, either to outside investors or customers?
** Not an issue, they have no outside investors, and further they've talked with VCs to raise the issue that startups _should_ be looking at Amazon's services (and if not, why not)
h2. Superninja Privacy Techniques for Web Application Developers
Marc Hedlund and Brad ...? from Wesabe. Wesabe is a personal finance web application.
- Keep critical data local. If there's data you'd never ever ever want to lose, don't put it on a web site.
Use a privacy wall to separate public and private data
- created a wesabe uploader for Mac/Windows to keep bank credentials on your computer. The uploader downloads data from bank sites, strips certain data out of the files, then uploads to Wesabe.
- don't trust the site. sensitive data filtered before it ever reaches the server.
- requires a download
- puts burden on user to maintain a secure machine (same risk as using a web browser to bank)
- if successful, risk of trojan targeting
- use secret key as index in db
- secret key is only computed when user is logged in (they use hash(password + salt))
- secret key stored in session data
- other paths through the db: if you're using a privacy wall, you must ensure all transactions traverse it
- the data itself can leak information
- logs and exception reports can capture leaked information
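The privacy-wall key derivation they describe, hash(password + salt), might be sketched like this (sha256 is used here for brevity; a real deployment would favor a slow KDF such as PBKDF2):

```python
import hashlib

def secret_key(password, salt):
    """Derived only at login time (hash(password + salt)) and held in
    session data; it is never stored, so the server alone can't walk
    the private side of the database."""
    return hashlib.sha256((password + salt).encode()).hexdigest()

# Private rows are indexed by the derived key, not by user id, so a raw
# DB dump doesn't link accounts to their sensitive records.
private_rows = {secret_key("hunter2", "per-user-salt"): {"balance": 1234}}
```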
Use partitioning to protect against breaches
- password changing and recovery become trickier
- use a _locker_: generate a one-time key for the user, stored in the locker
- encrypt using the locker key rather than the password
- troubleshooting can be harder
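The locker idea could be sketched as follows. Loud caveat: the XOR keystream below is a toy stand-in for a real authenticated cipher (e.g. AES-GCM); only the structure, a random one-time key wrapped by a password-derived key, is the point:

```python
import hashlib
import secrets

def _xor_stream(key, data):
    # Toy XOR keystream for illustration only; use a real AEAD cipher.
    stream, ctr = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def make_locker(password, salt):
    """Generate a one-time data key and store it wrapped by a
    password-derived key.  User data is encrypted with the locker key,
    so a password change only rewraps the tiny locker, not all the data."""
    locker_key = secrets.token_bytes(32)
    pw_key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return _xor_stream(pw_key, locker_key), locker_key

def open_locker(password, salt, wrapped):
    pw_key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return _xor_stream(pw_key, wrapped)
```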
data fuzzing and log scrubbing
- keep pools of sensitive data separate
- eg membership and financial records kept separate
- no relationship between them other than status
- reduces the impact of any breach; firewalls off anything truly identifiable
- allows separate policies and approaches by data type
- pretty much zero drawbacks other than implementation time
- (currently) no requirement to retain specific data on users of a service (in the US)
- Subpoena / warrant may require that you give up all data on a user
- Different countries have different data retention policies (see epic.org)
- filter key parameters from logs
- remove some of the precision of IP addresses
- remove precision from timestamps since they too can be used to identify someone (cf. example of whistleblower information)
- prevents leakage of passwords
- avoids giving attackers / law enforcement a way through the privacy wall
- loss of certain private data may require you to notify your customers
- best protection is to delete your logs
- important to have a public policy in place (cf link to eff.org policy information)
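A log scrubber covering those tips (filter key parameters, coarsen IPs, round timestamps) might look like this sketch; the patterns are illustrative, not exhaustive:

```python
import re

def scrub(line):
    """Scrub one log line: filter password-style parameters, zero the
    IP's last octet, and round timestamps down to the hour."""
    line = re.sub(r"password=[^&\s]+", "password=[FILTERED]", line)
    line = re.sub(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b", r"\1.0", line)
    line = re.sub(r"\b(\d{2}):\d{2}:\d{2}\b", r"\1:00:00", line)
    return line
```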
use voting algorithms to determine public information
- no protection against wiretap orders
- difficult to cover all your bases (use centralized logging)
- "the esp game" to tag things at CMU.edu. If two people tag something the same thing at the same time, maybe that's a good tag to apply.
- look at google image labeller
- when people agree on a term, it's common knowledge
- if enough people agree, it's probably publicly known
- private transactions shouldn't be shown on the site
- lots of users naming a merchant probably means it's public
- works on opaque information
- reliable -- very few faults since launch
- no manual work needed
- information is hidden until threshold met (understates available info)
- can leak data if the threshold is too low
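The threshold scheme can be sketched in a few lines (the threshold value here is an arbitrary choice for illustration):

```python
from collections import Counter

def public_name(votes, threshold=5):
    """Reveal a merchant name only when enough independent users agree;
    below the threshold the name stays hidden, deliberately understating
    what the system knows rather than leaking a private transaction."""
    if not votes:
        return None
    name, count = Counter(votes).most_common(1)[0]
    return name if count >= threshold else None
```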
- hash your passwords; don't store them in plaintext
- random (non-sequential) database ids. Don't use auto-inc ids in public data.
- data bill of rights -- your data is your data. can export, delete, etc.
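Those last tips (salted password hashes, non-sequential public ids) might be sketched as:

```python
import hashlib
import os
import secrets

def store_password(password):
    """Never store plaintext: keep a random salt plus a slow salted hash."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password, salt, digest):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return secrets.compare_digest(candidate, digest)

def new_public_id():
    # Random, non-sequential id for public data; auto-increment ids leak
    # record counts and make enumerating other users' data trivial.
    return secrets.token_hex(8)
```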
Posted in ETech