Archives

Marathon everRun MX

Presentation Click to view Marathon slides and demo

everRun® MX: Always-on Application Availability For Windows Environments

You no longer have to settle for application downtime or the risk of data loss in the event of a system failure. Now you can move from a reactive, recovery-based model in which downtime is a given in the event of failures, to a new prevention-based model, in which downtime is prevented. Your organization can compute through failures and have always-on application availability, and the beauty is you can have this in a simple and affordable solution that requires very little IT intervention.

Marathon’s everRun MX is the world’s first software-based fault tolerance solution that supports single-processor, symmetric multi-processor and multi-core Microsoft applications. Now businesses with limited IT resources can have a simple and affordable solution that keeps applications available during system failures and ensures continuous business operations. With everRun MX, all your Microsoft applications can have fault tolerant protection for a cost that is lower than today’s recovery-based, high availability solutions.

How everRun MX works

Datasheet everRun MX data sheet

everRun MX

Marathon everRun FT

Marathon everRun FT is a fault-tolerant solution designed for Windows 2003 on Intel platforms. Unlike clustering, it works at a very low level and is transparent to both the application and the OS, so it works with any Windows-based application. Compared to clustering, there is no failover, no downtime and no loss of transactions, giving more than 99.999% uptime. In addition to providing better availability than a cluster, the overall cost is lower.
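
To put the 99.999% figure in perspective, the short Python sketch below converts an availability percentage into the downtime it allows per year. It is plain arithmetic, not a Marathon tool; the function name and the use of an average 365.25-day year are assumptions made for illustration.

```python
# Convert an availability percentage into the downtime it allows per year.
# Assumption: an average year of 365.25 days (leap years included).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99.999%, the budget works out to a little over five minutes of downtime in a full year.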

Datasheet Compare everRun HA and everRun FT

Marathon everRun HA

An alternative to everRun FT, everRun HA provides the ability to run any Windows-based application on Windows 2003 on any hardware, even if the two servers are dissimilar.

The two servers provide complete redundancy for the virtual environment, allowing I/O to be seamlessly redirected away from a failed device to the redundant device.

Datasheet Compare everRun HA and everRun FT

  • When a server fails, processing continues without interruption and with no loss of transactions.
  • Costs less than a cluster. Typically only one application license is required for each virtual server instance.
  • Maintenance-free. There is no failover to test and maintain, and no special skills are required.
  • Runs all Windows applications. No scripting or modifications are required.
  • Runs on standard servers of your choice from Dell, IBM, HP, NEC, Fujitsu and others.
  • In use since 1996, with over 3,000 installations, including large reference accounts in Asia and around the world running the most critical 24×7 applications.

everRun MX Presentation and Live Video Demos

everRun HA/FT Live Video Demo

Datasheet everRun Products


Marathon Blog

Real Redundancy is a Tenet of True Availability

I grew up in an old, two-family Victorian home just outside Boston, MA. My father, like many dads of the post-World War II generation, had a workshop in the basement. It worked out well because the house needed quite a bit of regular maintenance and repair. As a child, I found the vast array of gear and other paraphernalia in his workshop fascinating. There were tools for framing, plumbing, electrical, landscaping, surveying, automobiles, and … airplanes. (My father was an airplane mechanic in the Pacific during the war.) More interesting was that he seemed to have two of everything. “Why?” I asked. “In case one breaks,” he said. “That’s one of the constants of life.”

Fast forward four decades and my father’s comment comes back to me. I was hosting a webinar on the changing requirements of application availability. Unlike the hosts of most live webinars today, I try to answer every question that is submitted. One question came from an employee of a competing hardware vendor in fault tolerant computing. (I don’t keep competitors off of our webinars.) He was complaining that we were not paying sufficient attention to specialized hardware solutions for fault tolerance. That was intentional because, in my opinion, a single piece of hardware has a congenital risk of failure, regardless of whether or not redundancy is “built-in.” I’ve been involved with too much specialized hardware development to argue otherwise. Let’s now go back to the core issue.

The tools in my father’s workshop are like IT resources in any company: we rely on them to be productive and do things beyond our core competencies. But IT resources, just like tools, break. True availability and fault tolerance are built on physical redundancy. Fortunately, it’s now easier than ever, both economically and technologically, to have redundancy keep things going non-stop.

After the webinar, I did reach out to the person with the “one-box” issue, but he wasn’t interested in responding. Too bad because I think it’s time to start having the debate over what availability means. Remember, redundancy is important. Why? To quote my father: “In case one breaks. That’s one of the constants of life.”

Rob Ciampa

The Dangers and Risks of the Norms of Availability

OK. We’re not being coy, even though we have a big “Major Product Announcement” box on the front page of our web site. Really. Am I going to share? Yes, but not now. Instead, I’m going to provide a bit of a drum roll and some recent feedback from many worldwide discussions with some great analysts and thought leaders.

To start off, we’re not just making a product announcement. Rather, we’re proposing an entirely new way to think about availability: one that will actually work – and work for the masses. We’re not pulling any punches. It builds upon years of experience, 14,000+ implementations, and thousands of customers. Combine that with some stunning technological and price-performance breakthroughs and the game begins to change. It’s a direct assault on what I’ll call the “norms of availability.”

These norms have forced many organizations either into accepting a false sense of security or tolerating downtime that could have been prevented. We’ll be attacking both this week. So what are some of these norms of availability?

  • Recovery is OK because it’s getting faster. Really? Go ask American Eagle Outfitters about their 8-day outage debacle. Given that over 30% of recoveries don’t go as planned, this is dangerous, especially now because the stakes are so high. “Recovery” is still application downtime and still presents a high risk of data loss. Remember: prevention is better than recovery.
  • Virtualization is availability. Actually, virtualization is consolidation. Virtual machines (VMs) fail and have to be restarted. Virtualization, though very important, is a subset of availability. Want to gauge availability risk with pure-play virtualization? Look at the other pieces such as storage. No big deal? Most organizations will disagree on many fronts, including price and complexity. Virtualization matters (and we’re fans), but availability should be thought of first.
  • DR keeps us going. DR is the nuclear option. It’s a last resort when major catastrophes occur. It should NOT be used when a disk drive fails. (More on this in a subsequent post.) DR is a necessity, but needs to be combined with local protection to make a very powerful availability combination.
  • We have high availability support for our applications. That used to be good enough, though it’s really an inherently flawed approach. Remember, it’s still a restart; it’s still complex; and – do the math – it’s still expensive.
  • Cloud computing will solve all our issues. Actually, it may ultimately be a great part of the DR component of broader availability, but it’s just not going to work for localized failures. Time to think holistically.
  • Backups protect us. They sure do, but it may be a day or two late. Can your business afford that? Keep your backups going, but consider other things to amplify availability.
  • I’m not worried because I never had a problem. Wow. Now I’m really scared. You may want to give us a call before you get fired. During recent services work, we’ve helped many of our customers not just identify, but find their critical assets. Do you really know where your important assets are? Do you want to be looking for them when something goes awry?

I’d like to say these are tales of fiction, but they’re not. They’re part of that dangerous norm and we hear this regularly. Fortunately, many organizations are getting better. Shortly, we’ll provide them with availability capabilities that they’ve never had access to, either because of economics, complexity, or scalability.

For the past few months, we’ve been briefing analysts and other experts on what we’re delivering this week (and after). What has their response been?

  • Wow. This is BIG.
  • You’re changing the dynamics of availability.
  • You’ve eliminated the triage decisions IT has to make while tackling availability.
  • You’ve emancipated applications from downtime.
  • This is the “American Dream” of computing.

I like the last one, though I think the rest of the world will appreciate it as well. Stay tuned. We’ll share the word in the next couple of days. And it won’t end there because fault tolerance is about to go mainstream and the implications are substantial. And that’s just the start…

Rob Ciampa

Achieving 24×7 Uptime on a Budget

Last week we hosted a webinar looking at why “High Availability Doesn’t have to be Expensive.” We reviewed the trends that are creating today’s “always-on” world where businesses, customers and employees expect 24×7 uptime for all of their applications. We also highlighted several common sources of planned and unplanned downtime, and identified some specific single points of failure to watch out for. We also discussed two real-world success stories of companies that have achieved always-on affordable fault-tolerant protection with Marathon’s everRun software. Below is a summary of the Q&A portion of the webinar.

Q: How much does everRun cost?
Our starting price is less than $10K USD, for an implementation that can support any type of Microsoft application. So you can get full fault tolerance for less than $10K. Contact us for more information.

Q: Have any of your clients implemented Batchmaster ERP with everRun?
We have a number of organizations running ERP solutions protected by everRun. As long as the application runs in a Windows-based environment, we can support it without restrictions. We have more than 3,000 customers running all types of applications and we haven’t seen any application-level constraints within a Windows environment.

Q: What is the farthest apart two machines can be physically located?
We support SplitSite where we can separate systems by up to 100 miles, depending on your interconnect capacity. We see this a lot at airports for example, where our SplitSite solution has one server in one terminal and the second one in a second terminal.

Q: How do you determine when you should use an FT solution vs. a DR solution?
Fault tolerance and disaster recovery go hand-in-hand, but they are two different things that achieve two different results. When planning your application availability model, you need solutions for availability, recovery and back-up for complete protection. Day-to-day uptime is the job of availability/fault tolerance, which prevents the everyday failures that cause business disruption. When you are talking about a catastrophic event, like a tornado or hurricane, that’s when your DR solution comes into play. DR means recovery time, however, so it is not a good solution for protecting against everyday failures. The other important note here is that testing your backup/recovery solution is critical. Recent studies have shown that 30% or more of recoveries do not go as planned, so you need to test those systems regularly. In short, you need a local availability solution for everyday localized failures and a DR/back-up solution in the event of a catastrophe.

Q: When using SplitSite to separate your servers, is a T1 connection big enough?
It depends on what the applications are doing and what needs to be kept in lockstep. For smaller applications, a T1 could be sufficient, but if the applications tend to be very busy, it might not be enough. We can work with you to size that and let you know what you will need for your specific applications and requirements.
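
As a very rough first pass before a proper sizing exercise, you can compare the sustained data rate your applications generate against the nominal T1 line rate of 1.544 Mbit/s. The Python sketch below does only that arithmetic. The function names, the 70% headroom threshold and the sample traffic figures are all assumptions for illustration; they do not describe how everRun or SplitSite actually measures or moves data.

```python
# Rough link-sizing check against a T1 line.
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate


def link_utilization(bytes_per_second: float,
                     link_bps: float = T1_BITS_PER_SECOND) -> float:
    """Fraction of the link consumed by a sustained stream of this size."""
    if link_bps <= 0:
        raise ValueError("link speed must be positive")
    return bytes_per_second * 8 / link_bps


def fits_on_link(bytes_per_second: float,
                 link_bps: float = T1_BITS_PER_SECOND,
                 headroom: float = 0.7) -> bool:
    """True if the stream stays under the chosen share of the link.

    The 0.7 default is an assumed safety margin, not a vendor figure.
    """
    return link_utilization(bytes_per_second, link_bps) <= headroom


if __name__ == "__main__":
    # Hypothetical sustained rates in bytes per second.
    for label, rate in (("light application", 50_000), ("busy application", 500_000)):
        share = link_utilization(rate)
        verdict = "fits" if fits_on_link(rate) else "does not fit"
        print(f"{label}: {share:.0%} of a T1, {verdict}")
```

A check like this only tells you whether a T1 is plausible at all; latency and burst behavior still need to be looked at for your specific applications.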

Q: Does everRun work with Small Business Server?
Yes it does. We have full support of Small Business Server.

Q: Does everRun work with SQL 2008?
Yes, we have full support for SQL 2008 as well.

Q: Can you force a failover manually (for example a corporate policy expects that an application be tested)?
Yes, you can test the systems live and force components and systems to fail manually to make sure that everything keeps working as planned. We had one customer that had some applications protected with everRun, and a second system that was not protected with everRun. There was a disk drive failure on the unprotected system, so what they did was actually pull the working disk drive from the everRun-protected system and use it to temporarily get the unprotected system back up and running. Even without the disk drive, the everRun-protected system kept working. This is obviously not something that we recommend that you do, but it shows just how powerful the everRun solution is and how it can keep applications up and running, even when a disk drive is yanked right out of the system.

Effective Risk Assessment: Q&A

We had a very lively presentation and Q&A during last week’s webinar “How to Cut Risks and Costs with a Downtime Analysis and Action Plan.” A summary of the Q&A is below.

Q: Should branch offices be included in a downtime assessment?
Absolutely – you can’t ignore branch offices. Forrester estimates that 20% of your business comes from branch offices. IT needs to make sure to include those in your assessment plans and budget.

Q: How often should I conduct a business and risk impact assessment?
We’ve found with our customers that an annual assessment is usually sufficient, unless you have some significant kind of change like an acquisition or new location. In that situation you obviously need to do a refresh. You can then use that info moving forward as you conduct your annual assessment.

Q: Is there any available information about rough cost estimates of down time impact in control systems like DCS or SCADA and Historians like the one you showed for IT systems in one of your slides?
We work with a number of ISVs in the process control space including GE, Johnson Controls, Rockwell and many others. We conducted an assessment in a pharmaceutical plant where one minute of downtime led to the discard of an entire batch, which resulted in a loss of $950,000 to $1.1 million. In process automation and process control, downtime also affects efficiency. We had one company doing waste water treatment that couldn’t handle the processing levels because of the downtime they were having, and they were considering opening up a second facility. The assessment revealed that they could actually just retool their existing applications to increase their efficiency and not have to open a second facility. There’s a huge safety element here as well. When some types of systems go down, it can cause significant safety hazards to employees and others. This should also factor into your downtime risk assessment.
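
To turn a per-incident figure like the pharmaceutical example into a yearly estimate, multiply it by the number of incidents you expect. The Python sketch below uses the $950,000 to $1.1 million range quoted above; the incident count of two per year and the function name are made-up values for illustration, not data from any assessment.

```python
# Estimate a yearly loss range from a per-incident loss range.
BATCH_LOSS_USD = (950_000, 1_100_000)  # per-incident range quoted in the answer above


def annual_downtime_loss(incidents_per_year: int,
                         loss_per_incident: tuple[int, int]) -> tuple[int, int]:
    """Return the (low, high) yearly loss for a given number of incidents."""
    low, high = loss_per_incident
    if incidents_per_year < 0:
        raise ValueError("incident count cannot be negative")
    if low > high:
        raise ValueError("loss range must be given as (low, high)")
    return incidents_per_year * low, incidents_per_year * high


if __name__ == "__main__":
    # Two incidents per year is a hypothetical figure, not a measured one.
    low, high = annual_downtime_loss(2, BATCH_LOSS_USD)
    print(f"Estimated annual loss: ${low:,} to ${high:,}")
```

Replace the incident count with figures from your own outage history to get a number you can put in front of the business.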

Q: What about hosted applications? How can I incorporate those into my assessment?
Very often, some of your most critical applications are no longer hosted at your site. They’re still obviously extremely important to the business and need to be included as part of your assessment. Treat them exactly the same as your on-site applications, but make sure that the vendor has the protections in place to keep them at the availability levels you need.

Q: With the increased reliance on the Internet, how do you factor the loss of the Internet (i.e. nationwide cyber attack) in risk/mitigation planning?
What we covered in the presentation is mostly what’s under your control, but you also need to factor in security needs and look at the areas outside your control. For example, what would happen to your business if your internet connection went down? Should you have a secondary carrier? Are you going to go from a T1 connection to some other kind of connection?

Q: Are Marathon’s assessment services delivered primarily as a way to introduce Marathon software into the account, or do you sometimes recommend other software solutions that may be a better fit?
It depends on what you need. Sometimes we’ll go into an organization and do an assessment and they’ll have applications that aren’t necessarily mission critical and can tolerate several hours or days of downtime. What they already have in place might be acceptable for that situation. Or they may be in a situation where they just need disaster recovery. For the instances where there are mission critical applications involved and they can’t tolerate downtime or data loss, that’s where we come in.

Q: Would you ever recommend the use of cloud-based VMs for disaster recovery?
It depends on your needs. When you look at the spectrum of availability, there are just so many buzz words and acronyms out there. Fault tolerance, high availability, disaster recovery, business continuity, replication, and on and on. There are efficiencies with cloud-based DR, but the reality is that a lot of these services use a “recovery” model, which means there is downtime involved. These types of services don’t keep your business going during an outage; they just help you to recover after the fact. At Marathon, our focus is on the prevention of downtime and the continuation of business.

Q: Is there a tactic (rule of thumb) you’d recommend to keep departments from classifying everything as mission critical, since everyone believes their app is mission critical?
Every department likes to think that their particular applications are critical to the business. This is why companies like to engage third parties to help them with this process. Companies like Marathon can come in with an objective perspective, ask detailed questions, and provide guidance without any of the internal politics getting in the way.