Symantec Gossip

Please post interesting insights into this vendor’s business and products/services here.

If you wish to post anonymously, please email david.ferris@ferris.com. Please identify yourself so that David can determine whether the posting should be taken seriously. Alternatively, you can phone him on +1 415 367 3436. We’ll post your material without identifying you.

Comments

  1. dferris
    Posted August 8, 2008 at 2:36 AM | Permalink

    Post from someone who requested anonymity

    EV deserves to be where it is. The architecture behind EV (designed by the engineering team at Digital in 1997) is still the best out there – the largest customer in Germany had 110,000 seats producing 1 billion emails in the last 5 years. If you find gaps, you have to remember some things: it is nearly 10 years old, it has over 8,000 customers, and you simply cannot change the underlying architecture in one release. Your installed base would kill you if you came up with a completely buggy redesign, especially as archiving is about predictability and evolution (not revolution). The next release will tackle the storage model (50% savings through better SIS) and offline features; the release after that will probably be the shift to 64-bit and a new primary indexing engine.

    Overall, the next phase of the evolution will be a much larger third-party ecosystem. I can see people delivering voice (call center) archiving and medical records management (DICOM format).

    Today it is surely the safest choice to go with EV, as it has all you need. For accounts of up to 5,000 people the chances of having a nightmare project are close to zero, at least if the consultant has some idea of what he is doing.

    Did you know that except for Nigel Dutt (the KVS CTO) and his wife, the whole 18-person engineering team from Digital is still there, now with the yellow Symantec badges? The engineering and product management team for EV at Symantec is nearly 200 people. High on their priority list is making the product so simple to install and use that you can put it into accounts of fewer than 100 seats.

    Some other insider info: for EV there is a team of around 5 people working on each of the services (Index Service, Storage Service, etc.) in Visual Studio Team Foundation Server, using the same development environment that Microsoft uses. It is probably one of the best-documented software products on the planet (check out the EV Performance Guide to get a good impression of the level of detail), which in turn has slowed the team down in the past – the guys at SYMC are either in a meeting or in a code review/documentation review process.

    Symantec can easily maintain and enhance the product; the real pain happened after the KVS/Veritas acquisition (believe me!). Today they are in better shape than ever. And that’s the key reason Nick Mehta left: there is not much left to do for the next few years, apart from executing the plan.

  2. analyst
    Posted September 5, 2008 at 4:39 PM | Permalink

    There’s a bit of Symantec spin in the above posting that needs a reality check.

    1) Claiming that the engineering team came from Digital (DEC) is itself a proclamation of very dated technology from a company long gone. Indeed, the original problem that eV set out to solve was email storage, which meant managing email ONE MAILBOX AT A TIME. The entire eV architecture was oriented towards that. When e-Discovery and Compliance needs surfaced, eV’s architecture was simply unsuited to searching and managing ACROSS ALL MAILBOXES. It was the equivalent of making small water tanks for each apartment, and then having customers ask for a massive water tank for the entire building. Symantec has not been able to make that massive water tank.

    2) The claim of 110,000 seats in Germany versus the recommendation to select eV for 5,000 seats is contradictory. The real explanation is that the 110,000 seats is probably all about storage reduction, which is a lot easier to handle since: (a) it only handles big attachments which affect only 10% of the emails; and (b) the user needs only to search one mailbox at a time. It would be a much more credible claim to say that eV can find, say, “harassment” across 110,000 mailboxes in seconds or minutes. Chances are that eV will take MONTHS to return that result. And that’s if it doesn’t time out first due to limited threads.

    3) The claim that single instance storage (SIS) will be improved in the next release by cutting storage down 50% is a direct implication that eV cannot do SIS well today. Indeed, eV’s SIS is so fragile, it is broken by a myriad of factors, including different email sources, different email/file applications, different Exchange/Domino mail servers, different vaults, etc. It’s a wonder eV can even speak of single instance in any way, shape or form.

    4) With regard to the claim that eV is the “safest choice” – it contradicts what the writer says about eV’s release-after-next, which will handle “64-bit and a new primary indexing engine.” It is obvious to any engineer that when that happens, the user’s entire email archive database will have to go through a complete migration/conversion. And if that happens 2-3 years down the road, the migration of the accumulated data will cost even more than the original acquisition cost! And that’s hoping the new version with transplanted blood (64-bit) and transplanted heart (new search engine) will work. So, how is eV the “safest choice”?

    There are many more contradictions and inconsistencies in the previous post, too many to go into here. To its credit, Symantec is a masterful marketer. However, when the rubber meets the road, the enterprise customer ends up owning all the problems that were hidden.

  3. Posted September 8, 2008 at 1:56 AM | Permalink

    Hi analyst,

    Archiving software has a 25+ year product lifecycle. After 10 years EV still has to be considered a young product, as most customers that buy today expect this software to be part of their infrastructure in 10-20 years’ time.
    EV has had journaling in the product from the beginning. Discovery Accelerator was started in 2001, when it became clear that legal teams need a specialised search app, one designed to search multi-threaded, scheduled and persistent, following a notion of “cases”.
    Depending on the complexity of your search it can well take hours for EV to search a very large deployment, but given the number of items and the volume of data this is acceptable. AltaVista is not a slow engine, it just lacks some newer features like regular expressions. That’s it.

    If you read carefully: the first poster says that there are very large deployments; in fact I know of 5 which are around 100,000 seats. The second statement is that you have close to zero chance of messing up a project of up to 5,000 seats.
    I read it a different way: if you have a project larger than 5,000 seats, you have to plan properly, get first-class consultants on board, and design and test carefully. This is true for any vendor. So where is the contradiction?

    I would disagree if you think storage reduction is easier than journaling. The security handling of people searching across all repositories they have access to is very challenging, and EV has a great security implementation in this respect. Also, with mailbox archiving you expose the product to the end-users. That’s not trivial.

    I have not heard of any customer on a current EV version taking months to search their data. Never.
    I would be interested in the metrics you talk about, so that I can verify this in our lab: how many servers, how many Vault Stores with how many items, what search terms.
    Hard to believe you are talking about a proper EV installation.

    I think your 2nd point is FUD, which is okay in the gossip section.

    The SIS model in EV works flawlessly de-duplicating identical emails. Now, as time moved on, people started to compare attachments, something EV does partly when connected to an EMC Centera. I know of 2 vendors that have started to implement attachment hashing in software, so for the market leader it is a major requirement to follow the trend early.
    So it might be that you are unhappy about the SIS model in the product today, but the guys are busy fixing all these issues for you, giving you one attachment copy per datacenter across all archiving sources.

    So while I can see some truth in your 3rd point when you talk about SIS limitations, I can assure you that you’ll have nothing to complain about in a couple of weeks.

    64-bit will be standard for enterprise software in two years’ time. There is no choice. So I don’t understand the argument about it. It is just recompiling the software with a few switches changed and doing testing on newer hardware.
    Every vendor has to do this.

    Chances are that there will be an alternative indexing engine while AltaVista will still be included to read the old indexes. I mean, the statement from the first poster is a summary of a presentation held at the Symantec Vision conference, so what is missing is the “subject to change” disclaimer. But the EV architecture allows querying different indexes and federating the results today. And from my talks with Symantec, the idea has always been to introduce a next-gen engine that helps replace AltaVista over time, not in a big bang. Most archiving vendors will require their customers to go through a larger change or conversion in the next few years, as has been the case with most vendors in the past.
    There is always the fine line between innovation and evolution.

    So your 4th point is based on wrong assumptions on your side. I feel that you, and the original poster, haven’t successfully summarized what’s right or wrong with EV.

    If you have more hidden problems to discuss, feel free to post them – but I haven’t really found one.

    Regards

    Daniel Maiworm

  4. analyst
    Posted September 9, 2008 at 9:25 PM | Permalink

    If I may deconstruct the marketing spin again, this time from Mr. Maiworm ….

    a) You say “it can well take hours for EV to search a very large deployment … this is acceptable.” Enterprise customers, especially for post-FRCP e-Discovery, no longer have the luxury of waiting for eV to get around to delivering results the next day or the next week. There are solutions out there TODAY which deliver a random search across an entire database of many hundreds of millions of documents within minutes. This leads to my next point …

    b) eV is NOT a “young” product, as you call it, not with that kind of lethargic, almost catatonic, searching. Let’s examine some facts. When eV first launched, the number of documents in its first installs compared to today’s needs was lower by a THOUSAND-fold. Architecture-wise, no system designed for 1/1000th of the volume can be re-jiggered to scale a thousand-fold. That’s like claiming that a PBX originally built for 100 users can miraculously scale to 100,000 users. Such an increase requires a complete re-design. Curiously, your posting mentions both a “young” product and the introduction of a “next-gen” engine. Another non sequitur.

    c) You say my 2nd point about eV’s search speed is “FUD.” In that case, maybe it would be easier to simply deal with FACTS. How long would eV take to search for a random term like “harassment” across a database of, say, 500 million documents? It would help if we did not waste time talking around and around the search speed problem.

    d) You claim “The SIS model in EV works flawlessly de-duplicating identical emails.” Flawlessly? eV’s storage efficiency is practically non-existent. Apart from all the broken SIS instances cited in my earlier post, eV wastes huge amounts of storage by wrapping an HTML copy in addition to the original email. That’s like storing a copy in English, and then writing another copy in Swahili. And, no, the argument from Symantec about HTML representing “future-proofing” doesn’t help when the customer has to double his storage hardware costs. Especially when he knows that HTML itself is a changing format; hence, “future-proofing” is nonsensical marketing spin.

    e) You suggest that my 4th point about eV’s “next-gen” search engine posing a huge migration expense risk for customers is “based on false assumptions.” You argue that “Chances are that there will be an alternative indexing engine while AltaVista will still be included to read the old indexes.” In effect, you are suggesting that customers should be happy keeping NEW and OLD archives running simultaneously. That’s like taking a taxi to work each day, and then ordering TWO taxis to work each day. The administrative overhead for the New Archive will have to be doubled, since the Old Archive will require a separate back-up and restore, suffer additional heartaches arising from hardware/software failures, require constant maintenance and fine-tuning, etc. Incidentally, the Old Archive CANNOT be a “frozen” system, since it still needs to go through the lifecycle changes for each email. In addition, many other complications arise from multiple searches which have to be executed across disparate data stores, which may then have to be de-duped separately, etc., etc., etc. If this is your preferred “evolution,” you may want to ask your customers if they agree.

    By the way, the entire fiasco with the dual archive would be exacerbated by the extremely weak eV architecture which proliferates database instances like there’s no tomorrow. I’d say that for 500 million documents, eV would probably require in excess of 60 databases! I wouldn’t want to be the DBA for that eV installation. I shudder to think of the humongous cost of administration, never mind coordinating backup and restore across 60 databases, especially across global time zones. I would expect you’d have to shut down all write functions during back-up. I wonder if your 24-hour global multi-national customer likes that?

    And, yes, there are many, many more hidden problems with eV. If, as you say, you “haven’t found one,” it would be an amazing feat of tunnel vision. Large enterprises, in particular, need to be more diligent about separating fact from marketing spin. Would you like me to continue?

  5. Posted September 15, 2008 at 3:24 AM | Permalink

    Some more analysis from the analyst … Great!

    a) It will only take hours for very, very complex searches that include multiple (50+) conditions against very large databases of billions of documents and that require searching both journal and mailbox archives. No search engine I know of can deliver this in minutes, especially if you need to serve end-users at the same time. Following up on this point: name a single other product that does the job better. I guess you have to reveal your employer now 🙂

    b) I did not say it is young in the sense of “fresh and immature”, but compared to the product lifecycle that archiving products have, it is by no means old. As I said, it has been constantly updated to utilize current MS technologies (MMC 3.0, .NET and C#, PowerShell). What in EV do you consider old? If you say old architecture, it does not resonate with me. It is the RIGHT architecture, no matter when it was designed.

    c) Less than 5 minutes, given that you don’t search against journal archives only, but also against more complex indexes like file systems and mailbox stores (which might contain data from PSTs that are currently not in the journal), and given that you have a decent setup with enough processing power and disk I/O.
    If you search journal archives only, this will be significantly less, as journal indexes are the fastest to process.
    If you are interested in the exact numbers for different setups, check the EV Performance Guide (Google it!).
    Symantec does not hide the numbers; rather, they publish them to provide sizing information for partners and customers. Just go out and check (the “Accelerator search performance” chapter inside the document).

    d) I don’t know why you believe that you have to split journal and mailbox data. They can go to the same store and have perfect single-instancing, even if you use envelope journaling. How many products provide that today?
    Yes there are two “logical archives”, but they can share one physical store. It makes it easier to handle security on the mailbox differently from the security in the journal.
    I think it is not even worth going into more detail here as your assumptions have been wrong so many times now.

    If you were in the archiving business, you’d know that a “long-term” rendition of the content in addition to the original is a key component of the strategy. We can argue whether HTML should be replaced by PDF/A or XML in the not-too-distant future, but having it in the product is a must. Most of the certifications in Europe require you to have such a “rendition”, although most of those were designed with TIFF and paper documents in mind.
    And the HTML copy is a “must have” when doing things like e-Discovery. The reviewer can see a preview of any attachment in 1 second, as opposed to opening PowerPoint and waiting 20 seconds for the document to load.
    That’s surely a plus if you pay >$400 per hour for a legal advisor.

    e) This thread would be much shorter and more interesting if you read carefully what I wrote.
    No vendor can claim today that he has a search engine technology that will be a leading product in 20 years time.
    Not Symantec, not Mimosa, not Zantaz/Autonomy, no one!
    Therefore you can expect most vendors to update their search technology at some point, sooner or later!
    As there are still customers using “slower” storage like optical and tape (even if they really should not!) you might be in a situation where a recall and complete re-indexing is virtually impossible. Also for large accounts the amount of mail to bring back and re-index is simply too big.
    Therefore the ONLY choice that vendors have is to have co-existence of two search engines. The customer should be able to select the point in time when the new engine takes over and indexes any new items that go to the archive.
    At the same time the old engine is still active to search in the old data. Physically it is the same archive, data is stored once, backed-up once and managed once – just indexed by two engines. The search application will federate the search results, so that you have a seamless experience when searching (Something that EV does TODAY!)
    This has close to zero impact on the admin side, except that it will take processing time in off-peak hours.

    You then have different options as a customer:
    I) Take one archive at a time and re-index the old content with the new engine. This should ideally be scheduled so that you can use weekends for it. After a few weeks or months all data is indexed in the new format.
    II) Leave the old content and index until the last items expire.

    The important fact is that you are not forced into a migration at the time of upgrade. You can decide when to do the re-indexing.
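
    To make the co-existence idea concrete, here is a minimal, hypothetical sketch (nothing to do with EV’s actual code; the engine classes and names are invented purely for illustration) of how a search layer can fan a query out to an old and a new index engine and federate the results:

    ```python
    # Illustrative only: federating results from an "old" and a "new" index engine.
    # The engine classes here are hypothetical stand-ins, not EV components.
    from dataclasses import dataclass
    from datetime import datetime
    from typing import List


    @dataclass
    class Hit:
        message_id: str       # unique id of the archived item
        received: datetime    # used to sort the merged result list
        engine: str           # which index produced the hit


    class IndexEngine:
        """Minimal interface both engines are assumed to expose."""
        def __init__(self, name: str, items: List[Hit]):
            self.name, self._items = name, items

        def search(self, term: str) -> List[Hit]:
            # A real engine would evaluate the query; here every stored hit matches.
            return list(self._items)


    def federated_search(term: str, engines: List[IndexEngine]) -> List[Hit]:
        """Query every engine, de-duplicate by message id, sort newest first."""
        seen, merged = set(), []
        for engine in engines:
            for hit in engine.search(term):
                if hit.message_id not in seen:   # the same item may appear in both indexes
                    seen.add(hit.message_id)
                    merged.append(hit)
        return sorted(merged, key=lambda h: h.received, reverse=True)


    if __name__ == "__main__":
        old = IndexEngine("legacy", [Hit("a1", datetime(2005, 3, 1), "legacy")])
        new = IndexEngine("nextgen", [Hit("b7", datetime(2009, 1, 15), "nextgen")])
        for hit in federated_search("harassment", [old, new]):
            print(hit.message_id, hit.received.date(), hit.engine)
    ```

    The point of the sketch is simply that, seen from the search application, the two engines look like one archive: one query goes in, one merged result list comes out.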
    So, like I said in my previous post: your assumptions are plain wrong, so you can keep your two taxis, the heartaches and the “frozen” system to yourself. They have nothing to do with the archiving world that I am a proud part of.

    f?) How do you get to 60 databases? You need one database per Vault Store (i.e. one per EV server), plus a central monitoring and directory database.
    Most of our customers have ~5 databases for up to 10,000 users. (Not sure what you were trying out…)
    In general, several small databases are easier to manage than one big one. What if your single database is corrupt? What if you run out of disk and want to move one database to a different LUN?
    I haven’t yet heard of any EV customer who thinks that the split of databases is a concern.
    For 500M documents, you would need (depending on SIS) ~200GB of database space. MSSQL is capable of holding this in a single database; still, I’d probably split it into 3-4 smaller DBs.
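
    If you want to sanity-check that ~200GB figure, the implied per-item metadata overhead is easy to work out (a rough back-of-envelope only, based on the ballpark numbers quoted above):

    ```python
    # Rough sanity check of the database sizing figures quoted above (ballpark only).
    documents = 500_000_000
    db_space_gb = 200                                  # quoted estimate for SQL metadata

    bytes_per_item = db_space_gb * 1024**3 / documents
    print(f"~{bytes_per_item:.0f} bytes of SQL metadata per archived item")

    # The same metadata split across a few smaller databases:
    for n_dbs in (1, 4):
        print(f"{n_dbs} database(s) of ~{db_space_gb / n_dbs:.0f} GB each")
    ```

    In other words, the claim works out to a few hundred bytes of SQL metadata per archived item, which is why the databases stay small relative to the archived content itself.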

    Maybe it is tunnel vision, but I feel like I pretty much debunked every single polemic accusation you made.
    Anyway, let fellow Ferris readers decide who is giving all this a marketing spin and who knows EV first-hand!

    I am off now and I will not respond to the same wrong assumptions for a third time!

    Best regards

    Daniel Maiworm
    vcare Infosystems

  6. Posted September 15, 2008 at 3:34 AM | Permalink

    A clarification from Ferris Research: in case it’s not obvious, the commenter with the moniker “analyst” is not a Ferris Research analyst.

    We are aware of the person’s identity and industry affiliation, but that is of course up to him or her to reveal, not for us.

  7. Posted September 17, 2008 at 9:23 AM | Permalink

    Also please note that Daniel’s comment on Sept 15 got truncated somehow. We apologize.

    His full comment is now available above…

  8. analyst
    Posted September 17, 2008 at 7:48 PM | Permalink

    In the interest of cutting through the thickening and hardening marketing spin, perhaps we can get down to a simple “put-your-money-where-your-mouth-is” validation.

    I understand that you believe and represent that an eV query that searches across all 500 million documents will take 5 minutes or less.

    I say we each put US$10,000 in escrow with David Ferris. I will provide 3 simple searches, i.e., single-word random searches. You provide David access to an eV production site that has 500 million or more documents (supervised access by user should be acceptable). There should be plenty of eV sites that large, if your postings are to be believed. David conducts the 3 searches and averages the time. If it comes in at 5 minutes or less, you take the $20,000 in escrow. Otherwise, it’s mine.

    Separately, I’m sure David will enjoy being the industry arbiter of such a defining archiving litmus test.

    What say you, Mr. Maiworm? No “assumptions” to debate. Simple hard cash vs. cheap talk.

  9. Posted September 18, 2008 at 5:12 AM | Permalink

    I am up for it, but the problem is that no large customer will ever agree to give anyone access (even over the shoulder) to one of their most security-critical systems.

    And frankly, I would beat you hands down on this one by throwing hardware at the problem.
    Just a thought: if I have 10 EV servers, one Vault Store on each containing 10 journal archives with 5M items in each archive, and I split the indexes across multiple spindles, e.g. 15,000 rpm SAN drives, then I have fulfilled your requirements.
    The EV Discovery Accelerator receives the results from all machines and sorts them. So if we agree not to have millions of hits that we need to sort (unlikely in the real world, as you would put conditions in there), then this process takes a few seconds.
    Each server has in theory 30 seconds to search 5M items (10 indexes against the 300-second limit). If I run 4 searches in parallel on the server, each index search has 120 seconds to come back with results. Easy! (Even Windows Desktop Search delivers around that speed!)
    The scale-out architecture and the splitting of index locations allow EV to get to any number, even significantly less than what you talk about. (May I remind you that Google uses the same approach with 100,000+ servers?)
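
    The arithmetic behind those numbers can be written down in a few lines – a back-of-envelope sketch only, with made-up variable names, assuming the index searches are independent and scaling the per-index budget by the parallelism, which ignores how the 10 searches actually pack into the 4 slots:

    ```python
    # Back-of-envelope check of the scale-out numbers above (rough assumptions only).
    TOTAL_ITEMS = 500_000_000
    SERVERS = 10
    INDEXES_PER_SERVER = 10              # journal archives per Vault Store
    TIME_LIMIT_S = 300                   # the 5-minute target
    PARALLEL_SEARCHES = 4                # concurrent index searches per server

    items_per_index = TOTAL_ITEMS // (SERVERS * INDEXES_PER_SERVER)      # 5,000,000
    sequential_budget = TIME_LIMIT_S / INDEXES_PER_SERVER                # 30 s per index
    # The post scales that budget by 4 for 4-way parallelism (rough approximation).
    parallel_budget = sequential_budget * PARALLEL_SEARCHES              # 120 s per index

    print(f"{items_per_index:,} items per index")
    print(f"budget per index search, sequential: {sequential_budget:.0f} s")
    print(f"budget per index search, 4-way parallel: {parallel_budget:.0f} s")
    ```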
    So if we say $50,000, then I would go through the pain of setting up an environment that blows the number away, but I doubt that you are really that serious about losing your money!
    If you introduce some more conditions and I need more power, I add another 10 servers. I get your money in any case!

    Daniel

  10. analyst
    Posted September 18, 2008 at 8:03 PM | Permalink

    As I expected, it’s basically no go.

    And you didn’t have to spend so much time detailing all kinds of theoretical performance in a lab, i.e., with gold-plated servers and skating downhill with a tailwind. (He doth protest too much?)

    It is only the production environment which is relevant and credible, not some artificial one manufactured ad hoc, which can be jerry-rigged ten ways to Sunday.

    I also don’t buy your excuse that “no large customer will ever agree to give anyone access (even over the shoulder) to one of their most security-critical systems.” Does that mean you’ve never heard of a customer reference site-visit and an on-site demo? How is that different? Further, the archiving system would not even be required to do the data export (which, by the way, is painfully slow with eV); so where is the data sensitivity breach?

    btw, here’s a reality check. I just spoke with a well-known company today which runs eV for only 5,000 mailboxes, and it took them 4 weeks (!) to do a discovery search: 2 weeks to search and 2 weeks to export. (If you disbelieve this, perhaps I can arrange for David Ferris to contact this eV customer; that way, David can give us validating feedback on his site without publicly identifying the customer.)

    When it comes time to show actual performance, it’s just impressive how Symantec is so adept at dodging and weaving. I do applaud their marketing team, if not the engineering team. Unfortunately for the customer, marketing spin alone doesn’t keep the system up and running.

  11. Posted September 19, 2008 at 7:17 AM | Permalink

    Gentlemen, while this debate is interesting and stimulating, the tone is starting to border on the uncivil.

    I’d hate to see it descend into rudeness. Generally, we don’t like editing comments — please don’t force us to start deleting them.

  12. Posted September 19, 2008 at 10:42 AM | Permalink

    Richi, you’re right – it is just not worth it. So the very last post from me … civilized!

    To analyst:

    With nearly 10,000 customers for EV you’ll always find one that has put all the indexes for 5,000 mailboxes plus 50 journal mailboxes on one RAID1 disk aggregate, runs a 3-year-old version of EV using the end-user search instead of Discovery Accelerator with a 100Mbit connection to the database, or whatever else they thought was a good idea! If you find people that have been cost-cutting their environment to the point where it is unusable, I can also give you some great examples of how badly people manage MS Exchange, with the same results. (I hope we agree that Exchange isn’t a bad product per se.) Now if I say that for finding 500M items in a short timeframe you need more spindles, a different distribution of workloads and (for heaven’s sake!) Discovery Accelerator, then this is what you should pay consultants for: design a solution that meets your requirements. I have just outlined what you need. If you want a full solution design, I can send you my daily rate.

    Sure, large global customers, especially in finance, have no other issues than: “Hi, here is an email archiving guy from Switzerland. We have this p$%%&ssing contest going on, it’s a gossip section on an email website and it is a $10,000 bet. Can you please drop all the cases you have with the SEC and the bankrupt broker you’re dealing with, and stop the merger talks, because we really have to verify your system performance, whether it is 5 or 15 minutes… And yes, the guy watching this is David Ferris, an international celebrity in this entertaining space (no offense to you, David!).”

    Get serious!

    I am a consultant at two banks with 30,000+ user deployments – otherwise I would not be standing up to get the facts straight. And “no” – neither of them is willing to let anyone go to their DA, as they have a “6-eyes” configuration where a single person can only either create a case, run a search or see results. This is all monitored by what we call the “works council” over here. (This is Central Europe, where employees have at least some privacy rights, e.g. not being fired because of political newsletters you subscribe to.)

    Regarding the export: if you use enterprise-class storage and DO NOT use *.CAB files for your journal data (to all EV customers: just DO NOT SWITCH THIS FEATURE ON FOR JOURNAL DATA – it has its use case for very old mailbox stores), the export runs at a minimum of 1-2GB per hour, significantly more if you have multiple machines.

    I am not working for Symantec, although – without any denial – I have a background that explains why I endorse EV. So don’t blame Symantec for “marketing”, as they have nothing to do with this. I have not seen any official announcements from Symantec on this site and I doubt there ever will be, given the day-time jobs of some of the Ferris consultants at other archiving vendors.

    To the customer:

    If it is you the guy is talking about, send me a mail. Unlike others I use my real name, you can look up my background on LinkedIn, and I do not hide behind some nickname.
    Our company specializes in first-class Enterprise Vault services and we can surely help you with your search times.

    If you have an environment that has close to 500M objects in the store, I can get you a $5,000 cash rebate (see the analyst’s post above – 50% for me ;-))

    Regards Daniel

    (I am NOT going to reply to this thread anymore – ’nuff said!)

  13. Barry Murphy
    Posted October 23, 2008 at 3:19 PM | Permalink

    It’s come to my attention that many folks think that “analyst” is me, because I used to be an analyst at Forrester and now work at Mimosa, a Symantec competitor. Sorry to disappoint, but it’s not me.

    First, were I to post on a blog, I would use my name (as I’m doing now)… it’s cowardly not to. Second, I don’t believe in trashing competitors in a blog. And third, were I to trash Symantec, it would make me a hypocrite – at Forrester, I named Symantec a leader in email archiving.

    To David Ferris – I think allowing anonymous posts seriously hurts your credibility as an independent analyst. By not stating a name, it’s clear the poster has something to hide. As a former analyst, I know that independence is key to having influence and credibility in the market. The way this blog plays out does not make me want to turn to Ferris for more information or opinions; rather, it makes me wonder why Ferris Research is letting this happen on the Ferris site.

    I thought the purpose of these forums is to create constructive debate – as the host of the site, it’s the job of Ferris to keep things on track. Eliminating anonymity is one way to begin doing that.

    Barry Murphy
    Product Marketing, Mimosa Systems
    Former Forrester analyst

  14. Posted October 23, 2008 at 3:39 PM | Permalink

    Barry,

    Thanks for your observation. I agree, it would be better not to have anonymous postings. Initially, we thought we would not allow such postings.

    However, many people won’t contribute valuable material unless they can do so behind the camouflage of anonymity. I believe we would lose far more than we gain by requiring openness.

    Usually, the person concerned does identify themselves to us internally. Clearly, that’s preferable to complete anonymity, because it’s easier for us to qualify or disqualify them. Also, if we think any points made are false or misleading, we usually suppress them. If you find examples of material that is factually mistaken, or just hopelessly misinformed, please let us know. But do bear in mind that reasonable people can disagree on many things.

    We also try to discourage debate in which the parties speak disparagingly of each other.

    Bye for now

    david

  15. dferris
    Posted November 14, 2008 at 2:05 AM | Permalink

    We hear that Symantec is laying off most of its Vault sales specialists. Not clear what this implies–perhaps that Symantec is simply transferring the sales role to its mainstream sales team.

    Can anyone help further?

    –david

  16. dferris
    Posted November 27, 2008 at 3:28 AM | Permalink

    For discussion of the layoff, see our later blog posting at http://email-museum.com/?p=321680

  17. Nigel Dutt
    Posted December 24, 2008 at 2:46 AM | Permalink

    It seems like a significant time to make some comments here, because it is exactly 10 years since Enterprise Vault V1.0 first shipped – in fact it was Christmas Eve 1998. This was about 18 months after the initial “PowerPoint implementation” that was used to sell the idea and get the internal funding for the project. And before anyone leaps in and talks about “old code”, I’ll hasten to add that, just like Washington’s axe, there is probably not a single line of code left from V1.0 in the newest V8.0.

    I can’t claim that Enterprise Vault was the first product used for email archiving but I suspect it was the first dedicated email archiving product. It was certainly true that for the first year or two our main goal at shows and customer meetings was explaining what email archiving was and convincing people that they needed it at all. Within a couple of years it became a recognized product category and the question shifted to convincing people who now knew they needed email archiving that it was our product that they needed.

    Oddly enough, while the Exchange team at Microsoft always supported archiving partners, they never admitted to the need for archiving to solve the problem of mailbox and message store growth, but only recognized it as a journal repository for regulatory driven retention. This was in spite of the fact that for all the time I was directly involved in the product (until the end of 2005), mailbox archiving was always the biggest selling option.

    Having said that, there is a myth that regulatory-driven archiving only arrived later, around 2002; in fact it was always a major target for EV right from that first PowerPoint implementation, though it required Exchange 5.5 SP1, which was the Exchange release that first delivered journaling. Although the journal option was selling well right from the start, what is true is that the need for it became much more of a compelling event once the SEC started handing out very large fines for non-compliance, and the possibility of jail even loomed. Suddenly worried-looking legal guys and compliance officers started showing up in initial customer meetings, and then add-on products in the compliance and discovery areas started popping up.

    What still impresses me about EV is that there is a core of software engineers in the development team who have worked on the product since day 1 (and most of them had worked together for several years before that) even though it has now had five different company logos on the box. In the same way that EV’s original strong architecture has allowed its implementation to be rolled into the 21st century in an evolutionary way, its team has expanded to include new tools and techniques and, dare I say, new and younger engineers!

  18. Posted December 24, 2008 at 3:28 AM | Permalink

    Nigel–Thanks so much for this posting. Great to have you with us.

    Everyone–Nigel was the original developer/architect/brains behind EV. I first met him at Digital Equipment in EV’s first incarnation. Now he’s retired and enjoying his well-gotten gains–people like him motivate the rest of us.–david

  19. Nigel Dutt
    Posted December 24, 2008 at 9:13 AM | Permalink

    Thanks David,

    I’ll quickly add that I was never the architect and certainly not a developer for EV – those days were over for me even then! Derek Allan, a much younger man, was the architect from day 1 and he now leads the engineering team. I guess my role was always that of the “Enterprise Vault Chief Technical B**l-S****er”!!!!!

    Cheers – Nigel

  20. Posted January 7, 2009 at 5:48 AM | Permalink

    Hi Nigel,

    I am not surprised that you sent your post on Christmas Eve…
    You and Eileen should sometimes lean back and rest assured that you have changed many people’s professional and personal lives through your “insane” level of work and dedication to (a piece of software called) Enterprise Vault.

    You deserve David’s credit, without any doubt!

    Daniel

  21. Andrew Barnes
    Posted January 29, 2009 at 7:14 AM | Permalink

    I just came across this thread – very enlightening. Nigel, I’d dispute you on the B*llsh*t; you were always moderating others :) I’d call you the CBC (Chief B*llsh*t Controller).

  22. Heath
    Posted February 3, 2009 at 9:34 AM | Permalink

    In the blog section of the Mimosa website, they are belittling EV’s use of technologies….

    To quote from their blog….

    EV Continues to Struggle with Indexing
    One of the little-known facts about Enterprise Vault is that its Automatic Classification Engine (ACE) is based on technology from Orchestria. Did you see the announcement that Orchestria was purchased by Computer Associates last week? This does not mean that ACE will be disabled anytime soon; I suspect that any contractual agreement between Orchestria and Symantec will be honored by CA.

    But it raises the question regarding AltaVista (AV), which is the search engine used by EV. The reason that EV needed Orchestria for classification is that AV was not up to the task. As we all know, AV has been EOL’d since 2003 and is sorely out of date. Symantec is still left without a robust index engine for EV that can support all the modern archiving tasks such as retention and classification, as well as search. When will it replace AV?

  23. Posted February 6, 2009 at 5:36 AM | Permalink

    I think the first posts in this thread have discussed the pros and cons of AltaVista in detail, and I think Mimosa has very little reason to belittle EV. There are more interesting blogs out there comparing the maturity of both solutions.
    Categorization is a different, though adjacent, technology to full-text search. Symantec has a lot of expertise in this space, both in their anti-spam group and in their DLP (Vontu) group, so don’t worry there…

    It has been stated above that a new index engine would co-exist with AltaVista for several years, so that customers will not need to re-index or worry about operational impact.
    The real problem is selecting, right now, the right engine for another 10 years, and SYMC is actually lucky that they did not go for some other since-EOLed technology a few years ago (Convera, FAST).
    Bottom line: They have a number of really clever people in this area, a lot of expertise and I am sure they’ll come up with a great solution once the dust has settled and a sustainable engine has been thoroughly tested and validated.

    Regards Daniel

  24. Posted March 12, 2009 at 2:35 PM | Permalink

    Posting from someone who wants to be anonymous:

    Discovery Accelerator may have been a web page in 2001, but KVS didn’t design the interface we see now until 2005, and it was amateur at best. They have zero subject matter expertise, and Symantec still refuses to admit, see, or realize the impact that not having a sophisticated search engine has on the cost of pulling data from the archive. The idea behind an archive is to decrease the time and cost of getting data out to respond to a regulatory or legal request. Pulling back hundreds of gigabytes of data, most of which will be irrelevant, and having to dedupe, cull, and process it defeats the purpose of having a single repository.

    How does Symantec reduce the set of data being pulled if they don’t do SIS, and how do they reduce that set of data once it’s out?

    They are missing some very crucial elements and processes that are now the norm. They have a long way to go and they would do themselves a lot of good to bring in a subject matter expert and integrate a next generation search tool.

    They move slowly and they are falling behind more nimble players very quickly.

  25. Deborah Johnson
    Posted March 12, 2009 at 5:54 PM | Permalink

    As a point of clarification, Symantec only licensed 50 policies from Orchestria, and classification is entirely different from search.

  26. Posted March 21, 2009 at 4:51 PM | Permalink

    Hi David,

    Regarding your last post:

    True, the early versions were “amateur”, as nobody back in 2003 knew what impact a large legal case would have on the underlying architecture. Back then, KVS had very little understanding of how legal departments work, BUT:

    That’s 6 years ago… Today Discovery Accelerator has sold more than 2,000 times and is used by some of the largest financial institutions. The DA in version 8 was redesigned from the ground up, and I am involved in a project where we’ll be dealing with nearly a petabyte of data. So far, the new DA has exceeded every single requirement (incl. performance) we got from legal.
    And this is one of the tier-one global banks.

    So saying they have not managed to build up the skills is simply FUD. If anyone does not believe DA rocks, try it…

    We have already discussed the AltaVista story at length. You can read all the pros and cons above, so let’s just move on.

    Why on earth would DA pull back 1,000s of GB of data? If you put in the right search terms, you get exactly the amount of data that has to be reviewed. Normally you get a list of custodians and search terms and run the search, then review, then export.

    De-duping is only necessary (and is transparently done in the background) if you have multiple copies from multiple journal mailboxes. That normally happens when you have multiple Exchange 2003 Servers and databases, where each Exchange Database will deliver a copy to a different Journal mailbox. (Mind that EV will single-instance those identical mails on the storage!)

    That’s why Microsoft has changed Journaling in Exchange 2007 to significantly reduce the number of Journal copies through “Premium Journaling”. This is an Exchange 2003 Problem, not an EV one…

    Maybe your anonymous source can shed some light on “SIS”, or what he thinks that is. Let me explain what the rest of the industry thinks SIS is:

    Mail-level SIS: Storing only one copy of an identical mail
    Attachment-level SIS: Storing only one copy of an Attachment, even if it is used in different Mails
    Cross source Attachment-level SIS: Storing only one copy of a mail attachment and/or a file server file and/or a Sharepoint document if they are identical.

    Per-store SIS: Single-instancing is only done for objects within the same archive store
    Per-server SIS: SIS is done for objects on the same archive server
    Per-datacenter SIS: SIS is done across different servers, keeping only one copy per datacenter.

    Just to make it 100% clear: EV is, to my knowledge, the only solution that does cross-source attachment-level SIS per datacenter. One copy of any PDF file in your datacenter, no matter whether it was archived as an attachment from different e-mails, or from SharePoint, or from a NetApp filer.

    If your anonymous source thinks there is a better way of doing SIS, then please shed some light on it … (P.S.: EV supports block-level dedupe hardware from NetApp and Data Domain … just in case you even want to de-dupe identical 8KB blocks)
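
    For anyone wondering what attachment-level SIS means mechanically, here is a minimal, purely illustrative sketch (this is NOT EV’s implementation; the class and names are invented) of keeping one physical copy per unique attachment by hashing its content, regardless of whether it arrived from a mail, SharePoint or a file server:

    ```python
    # Illustrative sketch of content-hash based single-instance storage (SIS).
    # This is NOT how EV is implemented internally; it only shows the principle.
    import hashlib
    from typing import Dict, List, Tuple


    class SingleInstanceStore:
        """Keeps one physical copy per unique attachment content."""

        def __init__(self) -> None:
            self._blobs: Dict[str, bytes] = {}        # content hash -> single stored copy
            self._sources: Dict[str, List[str]] = {}  # content hash -> everything referencing it

        def archive(self, source: str, content: bytes) -> Tuple[str, bool]:
            """Return (content_hash, stored_new_copy); identical content is stored once."""
            digest = hashlib.sha256(content).hexdigest()
            new_copy = digest not in self._blobs
            if new_copy:
                self._blobs[digest] = content          # first time we see this content
            self._sources.setdefault(digest, []).append(source)
            return digest, new_copy


    if __name__ == "__main__":
        store = SingleInstanceStore()
        pdf = b"%PDF-1.4 example attachment bytes"
        print(store.archive("exchange-mail-1", pdf))    # (hash, True)  -> stored
        print(store.archive("sharepoint-doc-9", pdf))   # (hash, False) -> de-duplicated
        print(store.archive("fileserver-copy", pdf))    # (hash, False) -> de-duplicated
    ```

    The per-store, per-server and per-datacenter flavours above are just a question of how widely that hash-to-copy table is shared.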

    Once you realize that Symantec has completely revamped DA and moved to a .NET WPF interface to allow things like dual-monitor reviews, you see that your “anonymous” source is definitely a subject matter expert on anything but Enterprise Vault and legal discovery.

    They move steadily, they design sustainably, and they enhance their product dramatically with every release.
    The only thing that most competitors (like your source) bring to the table is FUD and wrong assumptions.

    I’d really like to hear about REAL FEATURES where others out-innovate Symantec, but there’s hardly anything I’ve heard from anyone in the market that is a great and original idea (except Zantaz when they were still actively developing their EAS product two years ago, and the big architectural bet from Mimosa to use TA shipping – before I found out that this means 5-10 times the EV 8 storage – maybe Chuck Arconi can shed some light on this).

    The poster says they are missing features that are the norm: Please give us 3-5 examples and I’ll stop posting on this forum.

    Enough said.

    Daniel

  27. Martin Tuip
    Posted March 29, 2009 at 1:09 AM | Permalink

    Daniel,

    EV is not the only product that does SIS across a datacenter at the attachment level. EV has only been capable of doing this since EV 8.0 came out, while many other products, like for instance Mimosa Systems NearPoint, have been doing this for quite some time. I can understand your enthusiasm regarding the fact that EV has this now … just making sure that you understand that this is already common stuff these days.

    Regarding the storage footprint of Mimosa, one has to clearly understand the difference in the way Mimosa creates the databases during install, following the appropriate Microsoft best practices (i.e. creating the database ahead of time, pre-sized; adding data to a pre-sized database gives better performance than what EV does, which is an insert and then an expansion of the database). I’ve seen vendors claim 20 times the storage … it all depends on the time of the month and the size of the moon these days. Besides that, comparing NearPoint to EV on some things is like comparing apples and oranges.

    Chuck’s opinion is always going to be biased since he sells and installs Enterprise Vault – didn’t he praise Mimosa NearPoint for the longest time and run webcasts saying how good it is? The same goes for my opinion on Mimosa or yours on EV … we both earn our living from these products at the moment.

    A lot of the stuff in EV 8.0 is nothing new under the sun … the highly praised SIS is one example.

    As always … best regards,

    Martin

  28. Clarke
    Posted April 2, 2009 at 5:11 PM | Permalink

    Mimosa trying to pull the wool over our eyes…..

    Please tell me the advantage of claiming datacenter SIS while committing to a 300-500% increase in storage demands before you have even archived a single item; that just seems stupid.

    Pre-bloating of databases CAN give a little performance advantage but this is not the default for Exchange, SQL etc. These technologies work just fine without pre-bloat.

    If your SIS is so good and you compress the data, surely it will be years of processing before you reach 300-500% of the original source data footprint, yet you are asking your customers to provide/maintain/back up/pay for/cool/power unused storage from day one. Personally I’d like to buy storage over time to take advantage of reduced storage costs and increased performance … plus it is a whole lot greener!

  29. dferris
    Posted April 3, 2009 at 10:59 AM | Permalink

    From an anonymous source who is usually quite well informed:

    I’ve heard a rumor that EV is going to replace AltaVista with a search/index engine from a French company. I presume this might be Exalead?

    H&S and Messaging Architects use Exalead

  30. Posted April 6, 2009 at 10:54 PM | Permalink

    Hi Martin,

    Interesting opinion! So Chuck is biased – coming from Mimosa, now promoting EV, although working for a reseller he could choose whatever product he wants?

    You are ex-KVS/Symantec, ex-Quest, now Mimosa (a full-time employee), and you think your post clarifies the situation?

    Well, no! Mimosa should not even take part in the SIS discussion, as it has to do with storage efficiency, the biggest hole in your current offering. Technically you might do SIS, but the resulting storage footprint (at least in our lab) is appalling.

    You cherry-pick “weaknesses” in EV, and once they are fixed a few months later you complain that the solution is not unique and original. Who cares? The breadth and depth of EV, its maturity and enterprise-class development have made it the leader in this space, and I am interested to see Mimosa fixing its very own problems in the same sustained and predictable manner.

    As you can see above, the next iteration of EV will most likely take another big FUD point off your list. Tough times for competitive marketing, mate!

    Daniel

  31. Martin Tuip
    Posted April 8, 2009 at 5:16 AM | Permalink

    I presume, then, that you acquired the appropriate legal licenses for those tests in your lab, Daniel. If you did purchase our software to test, then I would like to thank you for your purchase 🙂

  32. dferris
    Posted April 21, 2009 at 10:49 AM | Permalink

    We’re hearing through the grapevine that Kazeon is in acquisition talks with Symantec. Still just a rumor at this point… David Ferris
