ETech 2007 Day 2 p.m. sessions

h2. Don MacAskill (SmugMug) -- Set Amazon's Servers on Fire, Not Yours

p. SmugMug: 140MM photos, no debt, profitable since the first year. 192TB stored at S3, doubling yearly.

* S3::Simple Storage Service. $0.15/GB/month with replicas. REST API. Fast -- not 15K-SCSI fast, but internet fast.

h3. Why use them?

* not a lot of web-scale expertise on planet Earth
* reputation for systems
* [he] once competed with Amazon -- Fatbrain
* They eat their own dogfood. Dozens of products.
* Focus on the app, not the muck.

h3. Show me the money!

* Guesstimate: ~$500K saved per year
* Actual:
** Growth: 64MM photos -> 140MM photos
** Disks would cost $40K -> $100K/month
** $922K would have been spent
** $230K spent instead
** $692K in cold, hard savings
* Nasty taxes (on capital goods)! $295K 'saved' in cash flow. Bonus!
* Reselling disks to recoup sunk costs
* this is a partial cost-of-ownership number

h3. Sweet spots

* perfect for startups & small companies
* ideal for _store lots, serve little_ businesses of all sizes
* not so great (yet) for serving lots if you're a medium or large business -- transfer costs are high compared to buying bandwidth in 1Gbps+ chunks
* We're a _store lots, serve lots_ company. What to do?

h3. Like SmugFS

* Architecture remarkably similar to the internal SmugMug filesystem
* Similar to lots of startups
* Stupid that we're all building the same thing
* Easy to drop in
* Started on Monday, live in production on Friday

h3. S3 evolution

* started with S3 as secondary storage only. Too cold!
* Tried it as primary. Too hot!
* Finally a hot & cold model == just right!
* Amazon gets 100% of the data
* SmugMug keeps _hot_ data local (about 10%)
* 95% reduction in the number of disks bought

h3. Sample request

* check for the image in cache; if it's not there, log it, retrieve the image from S3, and return it to the client
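The sample request flow might look roughly like this (a minimal Python sketch, not SmugMug's actual code; the bucket URL, cache directory, and the injectable `fetch` hook are all invented for illustration):

```python
import os
import tempfile
import urllib.request

# Hypothetical values -- not SmugMug's real bucket or cache layout.
S3_BASE = "http://s3.amazonaws.com/example-photos"
CACHE_DIR = tempfile.mkdtemp()  # stand-in for the local hot-data store

def serve_photo(key, fetch=None):
    """Return photo bytes: local hot cache first, S3 proxy read on a miss."""
    cached = os.path.join(CACHE_DIR, key)
    if os.path.exists(cached):              # hot data (~10%) stays local
        with open(cached, "rb") as f:
            return f.read()
    # Cache miss: log it, proxy the read from S3, return it to the client.
    print("cache miss:", key)
    fetch = fetch or (lambda k: urllib.request.urlopen(S3_BASE + "/" + k).read())
    data = fetch(key)
    with open(cached, "wb") as f:           # warm the cache for next time
        f.write(data)
    return data
```

The second request for the same key never touches S3, which is what lets 90% of the data live only at Amazon.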
h3. Proxy vs redirect vs direct links

* Built SmugMug->S3 with multiple modes
* Can flip a switch to change
* Nearly 100% of serves are proxy reads
* Sometimes HTTP redirects
* Rarely direct S3 links

h3. Permissions

* SmugMug has complicated permissions
* Passwords, privacy, external links
* Proxying allows strong protection

h3. REST vs SOAP

* loves REST, hates SOAP
* Lightweight
* Nothing useful added by SOAP's complexity

h3. Reliability

* not 100%, but close
* more reliable than SmugFS
* no service level agreement
* Lots of failure points:
** SmugMug's datacenter
** Internet backbones
** Amazon's datacenter
* No other software, hardware, or service [they] use is 100% reliable either

h3. Handling failure

* Built from day one with failure in mind.
* Stuff breaks? Try again.
* Writes fail? Write locally, sync later.
* Reads fail? Handle intelligently. Alerts?

h3. Performance

* Fast for reads and writes
* Mostly speed-of-light limited: 20-80ms
* Parallel I/O for massive throughput: 100s of Mbps
* Machine measurable, human indistinguishable

h3. CDN?

* S3 is not a CDN[content delivery network]
* it's storage
* no global locations yet
* limited edge caching
* perhaps a future Amazon web service?

h3. How do they do their proxy reads?

p. Store-and-forward vs stream:

* Store and forward
** great resiliency
** poor performance
** if it's a big file, really poor performance
* Stream
** poor resiliency
** great performance
** do a quick HEAD first to verify

h3. The speed of light problem

* he was misquoted as saying Amazon was slow when trying to explain the speed of light
* Amazon has not solved faster-than-light data transmission. Yet.
* unavoidable; make sure your application can tolerate it
* parallelized I/O can mask the problem
* caching can help
* streaming can help

h3. Outages and problems

* not perfect: five major issues
* 3 outages of 15-30 minutes: 2 were core switch failures, one a DNS problem. Amazon.com was affected too.
* 2 performance degradations
** one a SmugMug customer noticed, another wasn't noticed
* Not a big deal; everything fails, expect it.

h3. SLA, service, and support

* SmugMug doesn't care about an SLA, but others might
* Service & support: one area where Amazon is weak.
** This is a utility
** They need a service status dashboard
** Pro-active customer notifications
** Ability to get hold of a human
* Support for developers is quite good.
* Amazon.com's customer service is good; AWS will likely catch up

h3. Saving SmugMug's butt

* knocked out power to ~70TB of storage. Oops!
* Moved datacenters during normal business hours, customers not affected
* Stupid bugs

h3. Miscellaneous tips

* use cURL
** faster
** more reliable
** storing vs streaming is simple
* make stuff as asynchronous as possible
** hides speed-of-light issues
** hides or masks problems
** fast customer service

h3. Elastic Compute Cloud (EC2)

* Like S3 but for computing
** scale up or down via API
** web servers, processing boxes, development test beds, etc.
* Launching a large EC2 implementation "soon"
** image processing
** 500K-1M photos/day
** 10-20 terapixels/day processed
** peaky traffic on weekends, holidays
** ridiculously parallel

h3. Simple Queue Service (SQS)

* Simple, reliable queueing
* Mates well with EC2 and S3
** Stick jobs in SQS
** retrieve jobs with EC2 instances using S3 data
** run jobs, report status to SQS
* $0.10/1000 items
** Priced well for small projects
** gets costly for large ones (millions of items)

h3. Missing pieces

* Database API or DB-grade EC2 instances
** Fast (lots of local spindles, lots of RAM)
** Persistent
* Load balancer API
** Single IP in front of lots of EC2 instances
** Programmable to add/remove/change clusters
** Can be done with software on an EC2 instance, but painful
* CDN

p. Slides will be at http://blogs.smugmug.com/

* How are they using EC2?
** The EC2 instances invoke SmugMug APIs to do work.
** The SmugMug servers don't really know much about EC2.
* I asked: has their use of Amazon been an issue, either to outside investors or to customers?
** Not an issue: they have no outside investors, and they've actually talked with VCs to raise the point that startups _should_ be looking at Amazon's services (and if not, why not).

h2. Superninja Privacy Techniques for Web Application Developers

Marc Hedlund and Brad ...? from Wesabe. Wesabe is a personal finance web application.

  1. Keep critical data local. If there's data you'd never ever ever want to lose, don't put it on a web site.
    • created a Wesabe uploader for Mac/Windows to keep bank credentials on your computer. The uploader downloads data from bank sites, strips certain data out of the files, then uploads to Wesabe.
    • don't trust the site. sensitive data filtered before it ever reaches the server.
    • requires a download
    • puts burden on user to maintain a secure machine (same risk as using a web browser to bank)
    • if successful, risk of trojan targeting
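The uploader's filtering step could be sketched like this (Python; the record shape and field names are invented for illustration -- real bank exports are OFX/QIF files with their own structure):

```python
# Client-side filtering: sensitive fields never leave the user's machine.
# These field names are examples, not Wesabe's actual schema.
SENSITIVE_FIELDS = {"account_number", "username", "password", "ssn"}

def strip_sensitive(record):
    """Return a copy of a transaction record with sensitive fields removed."""
    return {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}

record = {
    "date": "2007-03-27",
    "amount": "-4.50",
    "merchant": "COFFEE SHOP #12",
    "account_number": "000123456789",
}
clean = strip_sensitive(record)
assert "account_number" not in clean  # only the filtered record is uploaded
```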
  2. Use a privacy wall to separate public and private data
    • use secret key as index in db
    • secret key is only computed when user is logged in (they use hash(password + salt))
    • secret key stored in session data
    • beware other paths through the db: if you're using a privacy wall, ensure every transaction traverses it
    • the data itself can leak information
    • logs and exception reports can capture leaked information
    • password changing and recovery become trickier
    • use a _locker_: generate a one-time key for the user and store it in the locker
    • encrypt using the locker key rather than the password
    • troubleshooting can be harder
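The secret-key scheme above can be sketched as follows (Python; the hash choice, salt handling, and column names are assumptions for illustration, not Wesabe's actual implementation):

```python
import hashlib

def privacy_wall_key(password, salt):
    """Secret key used as the DB index for a user's private data.

    It can only be computed while the user is logged in (it requires
    the password), so the server cannot walk from a user row to that
    user's private rows on its own.
    """
    return hashlib.sha256((password + salt).encode()).hexdigest()

# At login, compute the key and keep it only in session data:
session = {"privacy_key": privacy_wall_key("s3cret", "per-user-salt")}

# Private rows are then stored under the key, with no user_id column:
#   INSERT INTO transactions (owner_key, ...)
#   VALUES (session['privacy_key'], ...)
```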
  3. Use partitioning to protect against breaches
    • keep pools of sensitive data separate
    • e.g. membership and financial records kept separate
    • no relationship between them other than status
    • reduces the impact of any breach -- firewalls off anything truly identifiable
    • allows separate policies and approaches by data type
    • pretty much zero drawbacks other than implementation time
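A minimal sketch of the partitioning idea (Python, with dicts standing in for two physically separate databases; the record shapes, and the choice of the privacy-wall key as the only way to reach the financial pool, are assumptions):

```python
# Two pools of sensitive data. In production these would be separate
# databases behind separate firewalls -- dicts here only for illustration.
membership_db = {}   # identifiable info: name, email, account status
financial_db = {}    # transactions, reachable only via the privacy-wall key

def create_user(user_id, name, email, privacy_key):
    # No foreign key relates the pools; nothing but a status field is
    # duplicated, so a breach of one pool doesn't identify the other.
    membership_db[user_id] = {"name": name, "email": email, "status": "active"}
    financial_db[privacy_key] = {"transactions": [], "status": "active"}

create_user("u1", "Alice", "alice@example.com", "key-8f3a")
assert "name" not in financial_db["key-8f3a"]
```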
  4. data fuzzing and log scrubbing
    • (currently) no requirement to retain specific data on users of a server (in the US)
    • Subpoena / warrant may require that you give up all data on a user
    • Different countries have different data retention policies (see epic.org)
    • filter key parameters from logs
    • remove some of the precision of IP addresses
    • remove precision from timestamps since they too can be used to identify someone (cf. example of whistleblower information)
    • prevents leakage of passwords
    • avoids giving attackers / law enforcement a way through the privacy wall
    • loss of certain private data may require you to notify your customers
    • best protection is to delete your logs
    • important to have a public policy in place (cf link to eff.org policy information)
    • no protection against wiretap orders
    • difficult to cover all your bases (use centralized logging)
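The scrubbing steps above (filter key parameters, blunt IP addresses, fuzz timestamps) could look like this in a centralized logger (Python sketch; the parameter names and log format are made up):

```python
import re

def scrub(line):
    """Scrub a log line before it is written: drop secrets, reduce precision."""
    # Filter sensitive request parameters (parameter names are examples).
    line = re.sub(r"(password|token)=[^&\s]+", r"\1=[FILTERED]", line)
    # Drop the last octet of IPv4 addresses.
    line = re.sub(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b", r"\1.xxx", line)
    # Truncate timestamps to the hour so they identify people less precisely.
    line = re.sub(r"(\d{2}:)\d{2}:\d{2}", r"\g<1>00:00", line)
    return line

print(scrub("10.1.2.33 - 14:35:27 GET /login?password=hunter2"))
# -> 10.1.2.xxx - 14:00:00 GET /login?password=[FILTERED]
```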
  5. use voting algorithms to determine public information
    • "the esp game" to tag things at CMU.edu. If two people tag something the same thing at the same time, maybe that's a good tag to apply.
    • look at google image labeller
    • when people agree on a term, it's common knowledge
    • if enough people agree, it's probably publicly known
    • private transactions shouldn't be shown on the site
    • lots of users naming a merchant probably means the name is public
    • works on opaque information
    • reliable -- very few faults since launch
    • no manual work needed

    drawbacks

    • information is hidden until threshold met (understates available info)
    • can leak data if the threshold is too low
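The voting idea can be sketched like this (Python; the threshold value and the shape of the vote store are invented for illustration -- Wesabe's actual algorithm isn't described in the talk beyond the threshold idea):

```python
from collections import Counter

VOTE_THRESHOLD = 5  # hypothetical; too low leaks data, too high hides it

votes = Counter()  # (opaque bank descriptor, user-supplied name) -> count

def record_vote(descriptor, name):
    """One user says this opaque descriptor means this merchant name."""
    votes[(descriptor, name)] += 1

def public_name(descriptor):
    """A name becomes public only when enough users independently agree."""
    candidates = [(c, n) for (d, n), c in votes.items() if d == descriptor]
    if not candidates:
        return None
    count, name = max(candidates)
    return name if count >= VOTE_THRESHOLD else None
```

Until the threshold is met, `public_name` returns `None` and the site shows nothing, which is exactly the "understates available info" drawback noted above.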

    miscellaneous

    • hash your passwords. don't store in plaintext.
    • random (non-sequential) database ids. Don't use auto-inc ids in public data.
    • data bill of rights -- your data is your data. can export, delete, etc.
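The random-ID point can be illustrated in one function (Python sketch using the standard `secrets` module; the ID length is an arbitrary choice):

```python
import secrets

def public_id():
    """Random, non-sequential identifier for use in URLs and public data.

    Unlike an auto-increment primary key, it doesn't reveal how many
    records exist and can't be enumerated by counting upward.
    """
    return secrets.token_urlsafe(16)
```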

    more information

