
Balancing the Question of Scale


To elegantly deliver Web applications to new and different users, such as those on phones and PDAs, an open, standards-based approach is best



April 1, 2005
Coursework.stanford.edu had humble enough beginnings in the late 1990s.

“It started as a research project, and I was the only developer working on it,” said Scott Stocker, a former history master’s student who today is director of Web communications at Stanford University in Palo Alto, Calif.

Currently, the course management Web application is used by roughly 600 professors each quarter to post assignments, foster online discussion and administer quizzes. Stocker, who has long since moved on and up the Stanford IT hierarchy, left behind a full-time staff of four to manage the application he built from scratch.

Managing application scaling, whether on a single university server or a massively parallel and distributed commercial system, is a challenge that just about every coder will encounter at some point in a career. And mobility only compounds the scaling issue, as new phone- and PDA-equipped users begin banging on Web applications designed for desktop and laptop browsers.

From academia to industry, hands-on coders are using a handful of best practices to address an explosion of scaling issues. While tools are available to help, an eat-your-vegetables kind of common sense seems to be the first step toward elegantly offering Web applications to new and different types of users.

Scaling at Stanford
Stocker’s users were professors who, despite having impressive resumes and jobs at tech-steeped Stanford, had varying degrees of Web competency. The application had to be flexible (to be useful both to the Web pros and novices on the faculty), robust (if it crashed or was buggy, no one would use it), and inexpensive (Stocker was a staff of one, funded by a small grant from the Andrew W. Mellon Foundation to develop new learning management tools).

The approach, as is fairly common in academia, was to get the ball rolling with open standards and open source. Stocker built on top of the Linux operating system and MySQL database. Even though it was a pre-J2EE world, Stocker was already hooked on Java and relied heavily on servlets and JavaServer Pages (JSPs).

Unlike CGI programs, Java servlets are persistent, standing by in memory to fulfill multiple requests once they’re started. And beyond the benefits of separating a Web page’s logic from its static elements, JSPs aren’t restricted to any specific platform or server.

As coursework.stanford.edu moved from this-might-actually-work to mission-critical, Stanford’s main IT shop eventually stepped in to support Stocker’s creation. Stanford IT is a Solaris/Oracle database environment, but the Java APIs, especially Java Database Connectivity (JDBC), plugged in easily enough to this back end.

The JDBC API allows Java programs to interact with any SQL-compliant database. Since nearly all relational database management systems (DBMSes) support SQL, and because Java itself runs on most platforms, JDBC makes it possible to write a single database application that can run on different platforms and interact with different DBMSes.

Stocker continues his open-source ways today. His latest project, Stanford’s public event calendar (events.stanford.edu), is as low-budget as it gets. The entire application runs on a single Red Hat Linux-powered Dell server in his office. A second Dell machine, which he uses as his staging environment, doubles as his hot spare.

“If something happens, I can just switch the DNS and I’m live again.”

Using the Apache Software Foundation’s Struts tags, a library of routines for many common coding actions, freed up Stocker to focus mostly on the business logic and presentation layer. And his intimate familiarity with and ownership of the codebase made it easier to scale the application to new customers, such as the Stanford College of Engineering, which now offers a flavor of Stocker’s calendar on its Web site.

Scaling at Powells.com
Nearly 700 miles north, in rainier and more coffee-soaked Portland, Ore., Darin Sennett is another fan of scaling with open standards and open source. Sennett is the director of Web stuff (his actual title) for Powell’s, the self-described “legendary independent bookstore.”

Powell’s launched its first rudimentary Web site in 1994, a year before Amazon.com appeared online. The site’s e-commerce application was decidedly human-centric.

“In the beginning, there were just three of us—the first programmer/designer of the system, and a woman and I who would receive orders [via e-mail], pull books from the shelves in the store, respond to e-mail, sell the books at the register and then box them up ourselves,” Sennett explained. “When we added our main store’s database, I was the customer service department and answered e-mails eight hours a day all summer, while the woman I worked with sold all the orders on a ten-key. Our shipping department consisted of one person with a tape gun and some recycled boxes.”

Powells.com has grown significantly since then. In January 1995, online sales were US$8,000, a tiny percentage of the business, and the site was averaging 470 searches per day. By 2003, 40 percent of Powell’s sales were coming through its Web site, and last year the Internet operations, including the 15-person shipping department and 20 technical and production employees, moved to a new 60,000-square-foot warehouse.

Powell’s may be the largest independent bookstore online, but its Web site is still small potatoes next to Amazon.com, which now does US$6 billion of business annually selling more than 20 million products online, including all 29 colors of the KitchenAid five-quart mixer.

“We’ve never had a big wad of investor cash, so we’ve had to work within profitability,” said Sennett. “We’ve been choosing food over special effects from the very beginning.”

Like Stocker, Sennett has dealt with lean budgeting by relying heavily on open standards and open source. He ticks off a familiar litany of fixtures atop which the site sits—Sun machines, the Solaris operating system, Apache Web server software, the MySQL database and lots of hand-coded Perl and PHP scripts.

Scaling for Sennett is more about end users and experimentation than the latest, greatest application architecture. He said he doesn’t worry so much about keeping up with technology, but instead just tries to sell books online better and more efficiently each day.

Which isn’t to say that Powell’s is anti-programming progress. Sennett is proud of the fact that Powells.com predates Amazon.com and that his site offered a shopping-cart feature before “shopping cart” had entered the Web lexicon.

Sennett’s current scaling challenge is transforming his site with Web standards that, unlike old-fashioned HTML, can potentially adapt to different types of output. A site redesign is under way, and one of Sennett’s programmers is at work emulating the current site structure in Cascading Style Sheets (CSS).

For now, new Really Simple Syndication (RSS) feeds of all the database-driven content on Powells.com are a first step toward standards that will make it easier to scale the site’s content to mobile users and others.
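As a rough illustration of what syndicating database-driven content involves, the sketch below builds a minimal RSS 2.0 document from a list of rows. It is not Powell's code (the article says the site runs on Perl and PHP); the function name, the row fields and the example.com URLs are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, channel_description, rows):
    """Build a minimal RSS 2.0 document from content rows (hypothetical schema)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for row in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = row["title"]
        ET.SubElement(item, "link").text = row["link"]
        ET.SubElement(item, "description").text = row["summary"]
    # ElementTree escapes characters such as "&" when serializing.
    return ET.tostring(rss, encoding="unicode")


# Hypothetical rows standing in for the result of a database query.
rows = [
    {
        "title": "Example review",
        "link": "http://example.com/reviews/1",
        "summary": "A short summary & teaser.",
    },
]
feed = build_rss("Example Reviews", "http://example.com/", "Daily book reviews", rows)
```

Because the feed is generated from the same rows that drive the Web pages, the content exists once and the output format becomes a presentation choice.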

One category available to RSS subscribers is Powell’s Review-A-Day service. Supported by content-sharing agreements with literary fellow travelers, new reviews are delivered daily to the Powell’s site. Atlantic Monthly, Christian Science Monitor, Esquire, New Republic, Salon.com and Times Literary Supplement all contribute, meaning there’s just one day each week when Powell’s overworked staffers have to pen reviews of their own.

It may seem quaint to consider content itself a scaling challenge. However, there’s no reason to race to offer syndication if new information isn’t appearing on a site with some regularity.

Scaling Tools Providers
Given their anti-establishment tendencies, it’s no surprise that academia and independent bookstores would try to get as much mileage as possible out of open-source and do-it-yourself code. But in much of big business’s mega-online operations, managing complexity and scale with proprietary software is the norm.

Indeed, many software companies have amassed fearsome market caps by providing applications that help companies to stitch together, scale and manage increasingly complex Web environments. One is Computer Associates, which provides software that helps manage the infrastructures of more than 95 percent of Fortune 500 companies, according to its Web site.

Paul Lipton, technology strategist for Islandia, N.Y.-based CA, has scaling advice for Web managers trying to keep costs down and handle growing groups of window shoppers—people who stop by a site and consume computing resources, perhaps to do some what-if travel planning, without ever making a purchase. The key, he said, is watching both Web traffic and Web server performance.

“What happens if you are just analyzing visitors’ access and behavior, but not monitoring availability or performance of your Web servers? In this scenario, you may not realize soon enough that your Web server went down, or that you are running out of system resources,” Lipton said. “Or what if you monitor the performance of your Web servers without paying attention to visitors and patterns followed on your Web site? In this case, you are probably not maximizing revenue.”
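Lipton's point about watching traffic and server performance together can be sketched in a few lines. This is not CA's product or any real monitoring API; the record fields, the summarize function and the one-second "slow" threshold are assumptions chosen only to show both views being computed from the same request records.

```python
from collections import Counter
from statistics import mean


def summarize(requests, slow_ms=1000):
    """Combine a traffic view and a performance view of the same requests."""
    hits = Counter(r["path"] for r in requests)
    errors = sum(1 for r in requests if r["status"] >= 500)
    return {
        # Traffic view: where visitors are going.
        "top_paths": hits.most_common(3),
        # Performance view: how the server is coping.
        "error_rate": errors / len(requests) if requests else 0.0,
        "avg_ms": mean(r["ms"] for r in requests) if requests else 0.0,
        "slow_paths": sorted({r["path"] for r in requests if r["ms"] >= slow_ms}),
    }


# Hypothetical request records, e.g. parsed from a Web server access log.
requests = [
    {"path": "/search", "status": 200, "ms": 120},
    {"path": "/search", "status": 200, "ms": 1800},
    {"path": "/checkout", "status": 500, "ms": 300},
    {"path": "/", "status": 200, "ms": 80},
]
report = summarize(requests)
```

In this made-up sample the most visited page is also the slow one, and the only error is on the checkout page. Neither fact would show up in a report that tracked traffic or performance alone.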

CA’s advice (the company has grown into one of the world’s most powerful software companies by proffering it) is to use a complete Web management solution that can manage from IT, end-user and business-user perspectives all at once.

Tom Murphy, director of application service management at storage software giant Veritas, now merged with Symantec, echoed Lipton’s remarks.

“Customer-facing Web-based applications need to ensure acquisition and retention,” Murphy said. “Organizations worldwide invest heavily in the design and architecture of Web-based applications, but frequently don’t take the time to learn about the real end users’ experience. How do organizations know for sure that potential customers aren’t leaving their Web site because of poor performance?”

In increasingly complex IT environments, it’s hard to know anything for sure about a Web site, and even the best traffic and performance analyzers can miss glitches and gremlins that frustrate end users.

Looking down the road for tools to fix this scaling- and complexity-related challenge, at least a few promising signposts appear. One is TeaLeaf Technology (www.tealeaf.com), which gives Web managers a click-by-click view of how customers browse their sites.

The company’s RealiTea application offers an instant replay of sorts, allowing developers to play back the page views and navigation choices leading up to a customer-perceived problem. Tower Records, for one, is using the application to help refine its checkout and search as it tries to turn more of the 60,000 to 70,000 visitors to its Web site into paying customers.

Another player helping to scale and troubleshoot complex Web environments is start-up Splunk Technology. CEO Michael Baum, a veteran of IBM, Infoseek, Yahoo and at least two earlier start-ups, explained the problem his team is setting out to solve in a Splunk homepage blog posting:

“A typical server today can log more than a gigabyte a day, and a small data center can generate over a terabyte of operational data a week,” Baum said. “In addition to the problems of scaling traditional solutions to data of this magnitude, variety and frequency of change, making meaning out of the data is still a difficult task for even the most experienced technical staff.”

There’s no word on when venture-backed Splunk—its name is derived from spelunk, to explore natural caves—will release its first product to help developers explore IT systems.

Even middleware behemoth BEA Systems is wading into the scaling and complexity fray. The company’s WebLogic Server 9 product, available in beta, provides new instrumentation to help monitor functionality and improve performance.

“The instrumentation framework is straightforward but powerful, and quite simply it works like this: You identify the points within your applications or within WebLogic Server that you’d like to monitor, and you specify actions to be taken at those instrumentation points,” explained BEA product marketing manager John Doppke. “This allows you, for example, to time how long a method call is taking, to log a stack trace whenever a method is called, etc.”
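The idea Doppke describes, picking a point in the code and attaching actions to it, can be illustrated generically. The sketch below is not WebLogic's instrumentation framework or its API; it is a plain Python decorator, and the names instrument, actions and lookup_price are hypothetical.

```python
import functools
import time
import traceback


def instrument(actions):
    """Mark a function as an instrumentation point and run the chosen actions."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if "stack" in actions:
                # Action 1: record the call stack that led to this point.
                stack = "".join(traceback.format_stack(limit=5))
                wrapper.events.append(("stack", stack))
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                if "time" in actions:
                    # Action 2: record how long the call took, in seconds.
                    wrapper.events.append(("time", time.perf_counter() - start))
        wrapper.events = []  # collected in memory; a real system would log these
        return wrapper
    return decorator


@instrument(actions={"time", "stack"})
def lookup_price(isbn):
    # Stand-in for real application logic.
    return {"0000000000": 9.95}.get(isbn)


result = lookup_price("0000000000")
```

After the call, lookup_price.events holds one stack entry and one timing entry, and the function body itself has not been touched.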

Mobile-Friendly Standards
Scaling issues don’t come up only when talking about diverse Web application stacks. They also come up when addressing the diverse nature of Web users, more and more of whom are armed with mobile devices hungry for rich data and services. Harris Hutkin’s job is to think about how to best feed this growing mobile mob.

As a senior mobile product manager at Time Inc., Hutkin is awash in content, from magazine articles to movie trailers. His challenge is scaling that content to a diverse array of portable platforms. It’s a challenge likely faced by lots of developers working on sites that predate the late 1990s tech bubble, which is Hutkin’s dividing line for old-school and new-school Web shops.

“The old school is those people who originally discovered the Web,” he said. “They built their HTML pages manually, mixing content and applications and relying on lots of custom coding.

“The new school is just about every company that’s formed since the bubble burst,” Hutkin continued. “Web infrastructure for these companies is marked by use of standards, such as XML. Data is separate from design.”

The new school appears to be in a much better position to address mobile users. Standards-based architectures are more efficient at producing HTML or Wireless Markup Language (WML) pages. The whole mobile thing isn’t a big deal for these shops, Hutkin said, since Wireless Application Protocol (WAP) 2.0 was defined to use standards-compliant data (XHTML) and style sheets (WAP CSS).

There’s no easy answer to dealing with vast amounts of legacy pages in which simple text and data are intermingled with HTML design elements. One approach is to write scripts to look for old HTML code in the data. The scripts can replace the HTML with procedural calls (procs), strip it out altogether, or notify an editor about the poorly formed content.

“By having these procs in the data, instead of the normal HTML tags, the template that builds the page can process the proc depending on where the data needs to be displayed [Web, mobile device, etc.] and generate an appropriate file that can be read by the reader,” Hutkin said.
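A minimal sketch of that cleanup-and-render cycle might look like the following. The proc syntax ({{proc:NAME}} ... {{/proc}}), the tag lists and the per-target renderers are all invented for illustration; the article does not describe Time Inc.'s actual format or scripts.

```python
import re

# Hypothetical mapping from legacy HTML tags to a device-neutral proc name.
TAG_TO_PROC = {"b": "emphasis", "i": "emphasis", "strong": "emphasis", "em": "emphasis"}
STRIP = {"font", "center"}  # purely presentational tags: remove outright
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")
PROC_RE = re.compile(r"\{\{proc:(\w+)\}\}(.*?)\{\{/proc\}\}", re.S)

# How each output target renders a proc (assumed targets and markup).
RENDERERS = {
    "web": {"emphasis": ("<em>", "</em>")},
    "wml": {"emphasis": ("<b>", "</b>")},
    "sms": {"emphasis": ("*", "*")},
}


def scrub(text):
    """Replace known tags with procs, strip some, and flag the rest for an editor."""
    flagged = []

    def swap(match):
        closing, tag = match.group(1), match.group(2).lower()
        if tag in TAG_TO_PROC:
            return "{{/proc}}" if closing else "{{proc:%s}}" % TAG_TO_PROC[tag]
        if tag in STRIP:
            return ""
        if not closing:
            flagged.append(tag)  # unknown tag: leave it and notify an editor
        return match.group(0)

    return TAG_RE.sub(swap, text), flagged


def render(text, target):
    """Expand procs into markup suited to the given output target."""
    def expand(match):
        before, after = RENDERERS[target][match.group(1)]
        return before + match.group(2) + after

    return PROC_RE.sub(expand, text)


legacy = '<font color="red">Sale:</font> <b>20% off</b> all <blink>fiction</blink>'
clean, needs_review = scrub(legacy)
sms_text = render(clean, "sms")
web_text = render(clean, "web")
```

Here the font tag is stripped, the bold tag becomes an emphasis proc that each target renders in its own way, and the unrecognized blink tag is left in place and reported for review.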

The benignly named Good Technology seems to understand the spectrum of code that mobile surfers may encounter, from bootstrapped to standards-based and Web services-friendly. The company’s GoodAccess and GoodLink products allow companies to offer their enterprise applications, including Microsoft Exchange, Oracle, Salesforce.com and Siebel, to mobile, handset-carrying employees.

“We recognize that as the transition is occurring to a Web services-based architecture, there are going to be many systems [particularly custom systems or intranets, for example] that do not have Web services,” said Dennis Yang, a Good Technology senior product manager. “Therefore, it is important to ensure that the platform has some kind of transformation capability to turn a Web page into something suitable for a mobile device.”

Yang added that the GoodAccess platform provides both Web services-based access and transformations of custom applications into mobile-friendly output.

But the questions are different when it’s an issue of content for the masses instead of access to a few important applications for employees inside a company computing environment.

“Are you prepared to offer all the content on your site in WML format, for instance?” Hutkin asked. “If your pages don’t render well across devices, then you may have to consider producing a version of the site in WML 1.x, since 1.x is the least common denominator.”

Even this least-common-denominator approach isn’t cheap, as WML pages still need to be built. For content sites, generating an article page isn’t difficult. Create a WML template, drop in the copy, and you’re done, according to Hutkin. But getting users to those article pages is another story.
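The "create a template, drop in the copy" step could look something like this sketch, which wraps article text in a single-card WML 1.1 deck. The function name and the convention that blank lines separate paragraphs are assumptions; a production template would also need the navigation cards Hutkin goes on to describe.

```python
from string import Template
from xml.sax.saxutils import escape

# A one-card WML 1.1 deck; $title and $paragraphs are filled in per article.
WML_ARTICLE = Template("""<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
  "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
  <card id="article" title="$title">
$paragraphs
  </card>
</wml>
""")


def article_to_wml(title, body):
    """Drop article copy into the WML template, one <p> per paragraph."""
    paragraphs = "\n".join(
        "    <p>%s</p>" % escape(part.strip())
        for part in body.split("\n\n")  # assumes blank lines separate paragraphs
        if part.strip()
    )
    # Escape the title for use inside a double-quoted attribute.
    safe_title = escape(title, {'"': "&quot;"})
    return WML_ARTICLE.substitute(title=safe_title, paragraphs=paragraphs)


deck = article_to_wml("Fish & Chips", "First para.\n\nSecond <para>.")
```

Escaping matters more here than on a desktop browser, since a WML deck has to be well-formed XML for a handset to display it.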

“Unless you have a standards-compliant homepage, you’re going to have to build a special homepage and any other navigational pages for WAP users to navigate your content,” Hutkin said. “This could be a time-consuming process.”

So how does the mobility cost-benefit equation work out today at Time? For an answer, consider that the company just signed a contract with U.K.-based Flytxt (www.flytxt.com), a Short Message Service messaging platform provider. Hutkin said that despite all the rich-content, mobile-friendly visions of the future, he believes that Time needs to go where the users are right now—SMS.

“Ultimately, I believe everyone is hoping that the [standards bodies such as] Open Mobile Alliance [www.openmobilealliance.com] will develop standards that make many of these current issues insignificant,” Hutkin said. “Until that time, we’re focusing on text-based programs, which are easier to develop and usable by the largest audience.

“We’ll be doing some WAP tests, but you won’t be seeing a WAP version of any of our titles in the short term,” he continued. “Development costs to create and maintain custom homepages are too great.”

