Words of Advice Posted By Daniel Kaiser, Esq. on September 1, 2010

From your Mentor… all the judges of the Seventh Circuit

To all the young Chicago attorneys out there getting ready to argue a case, want a tip or two from your judge first? Who could be a better mentor? Don’t worry, no political corruption is involved. The Seventh Circuit Bar is gearing up to deliver these helpful tidbits through its new e-mentoring project.

A product of the Association’s Young Lawyer’s Committee, the project currently offers video-based mentoring sessions with over 45 judges and senior trial lawyers from throughout the circuit. Even better, the plan is to further develop this library into a collection including every Seventh Circuit district and magistrate judge. It used to be that only a lucky few had access to the knowledge held by these jurists. It’s still a lucky few: those who are members of the Seventh Circuit Bar Association. But hey—progress!

For a teaser video visit the Circuit’s webpage at http://www.7thcircuitbar.org/
Or, for a quicker sampling, according to Circuit Judge William Bauer, “We should not be so involved with our cases and our clients that we lose our own humanity and our knowledge or belief of what’s right and what’s wrong . . . When it’s appropriate, keep your mouth shut.”


Logik — The Future of Law

The Future of Law Posted By Daniel Kaiser, Esq. on August 23, 2010

Where is this all going? Nicholas Parrella ran an article this summer in the New York Bar Association’s State Bar News speaking to the future of our profession. In it, Parrella discusses New York State Bar President Stephen P. Younger (Patterson Belknap Webb & Tyler LLP) and his new “Task Force on the Future of the Legal Profession.”

These days there has been a growing undercurrent of academics, practitioners and consultants trying to predict what the legal profession will become. Consultants Arthur Greene and Sandra Boyer, in their article titled “Professional Staffing in the 21st Century,” emphasize practice management and propose that the continued delivery of quality legal services will require a growing emphasis on finding and retaining support staff. This emphasis includes the use of paralegals to take on the more routine legal services, reserving the complex and higher-level work for the lawyers. Other practitioners have endorsed the growing importance of “eLawyering,” that is, a web-based presence ranging from basic (websites, preferably more than informational sites) to sophisticated (web-based law firms).

Younger states that “[a]fter weathering one of the worst years in recent memory due to the economic downturn, bar leaders across New York and globally are cognizant of the need to fundamentally change the way we as attorneys do business . . .” To that end, Younger’s new task force will focus on harnessing new technologies, training new lawyers, developing the means to a work-life balance in the face of a perpetually plugged-in virtual office, and reforming law firms. Regarding the “reforming law firms” prong of the task force’s focus, the billable hour has been pegged as problematic and potentially not in the clients’ best interests. As for the technologies prong, a special subcommittee of the task force will look specifically at technologies of the future and how to harness them.

Beam me up, Scotty.


Logik — Selling, We Were Born To Do It Right, So Why It Is So Difficult?

Selling, We Were Born To Do It Right, So Why It Is So Difficult? Posted By Todd Eastman on August 13, 2010

We were born to sell, right?  When you need something as a little kid you beg, plead, insist, put on a sad face and hopefully you get what you want, right?  My daughter is not even four and she has mastered the art of selling.  Walking by the candy store she’ll say, “Daddy, can we get some candy?” My first response is a quick “No, keep walking.”  What does she do?  She starts her sales process with her body language.  She slowly drops her head, all the while looking at me out of the corner of her eye.  She slumps her shoulders, arms hang loose, and BAM!  Dad is sold right into the candy store without her uttering a word.  Selling is an innate skill we all possess to some degree, and there are countless ways to approach it.

Most people are selling to their parents, family, and friends their whole lives.  To play sports, to hang out with certain kids, to work, not to work, college choices, career choices, dating the right person.  It’s endless.  Selling is an inherent process that all people are involved in every day.  Yet, why are a lot of people in the sales profession so bad at sales?

Let’s look at the eDiscovery industry.  The average customer in this market gets bombarded with cold calls, spam, mailed literature, knocks on the door…and did I say cold calls?  Sitting with one of my clients for the course of an hour would make your head spin.  The phone is ringing off the hook. Why? Because they don’t answer. Because they simply don’t have the time to answer all the calls they get.

Now this can be viewed as a good or bad problem to have.  Good because they have the pick of the litter for services and product offerings, and bad because it is very difficult to manage the overwhelming need for salespeople to connect with them.  Without products and services, most of us would be out of a job or unable to do our jobs effectively.  What should be discussed is approach.  Approach is key to successful selling and getting a customer to consider you and your company’s products or services.  It is those with a bad approach who sour our customers and potential customers, and ruin it for the rest of us.

You hear the horror stories all the time.  “Johnny the sales guy at XYZ vendor sneaked into our office and was handing out business cards.”  The one I like the best happened to me when a lady called for our CEO.  She said, “Hi, this is Amy. I recently spoke to your CEO’s assistant and she told me to call for him today at 10:00 AM.”  Well, the funny part is, our CEO doesn’t even have an assistant.  It was an old-school tactic for getting to the decision maker.  Rather than playing games that turn off prospective clients, use a professional approach to selling and try to always keep it that way.

Let’s take a look at some very common methods for gaining the attention of a prospect and ask ourselves, do they work or not:

Cold Calling, does it work?  I have one question: do you like getting cold called? No, you say?  Okay, next…

Cold Emailing, does it work?  Sure. While probably not the most successful, it at least gives someone the chance to respond to you on their terms, whereas cold calling does not.  The cold call is an unplanned, unexpected interruption to one’s day.  Advice for cold emailing: keep it short, sweet, and to the point.

Networking, does it work?  Absolutely. Networking is one of the single most important reasons that salespeople are successful.  Unless you have the most amazing new product (say, Viagra) and an insane amount of demand (tons of men over 50), you are going to have to get out there and really work at selling your product or service.  Networking is a great medium to meet potential and current clients and give your elevator pitch in the hopes of a follow-up meeting.  Networking also leads to referrals; ask any successful salesperson, and they will tell you they get a lot of really good referrals.

Put yourself in the customer’s shoes.  When someone calls you out of the blue trying to peddle you something, how do you feel? What are your actions? How would you like to be approached?  When you have the answers to those questions, you probably have an idea of what type of selling you should be doing and what will be successful for you and your product or service.

Feel free to post your sales highlights, horror stories, or best practices so others can enhance that which we all must do in one capacity or another—SELL!

Good luck, have fun selling, and don’t forget what you can learn from a toddler!


Logik — Polish up your scripts with Optparse

Polish up your scripts with Optparse Posted By Adam Reilly on July 28, 2010


If you’ve ever written an especially useful or popular script, you’ve noticed that features tend to creep into the codebase as you encounter variations in the input.  As code evolves to handle more and more variation, you may notice that distinct ‘modes’ of operation arise.  One way to accommodate these different modes is to use values hard-coded into the source.  Settings such as field delimiters, input paths, recursive operation and output paths are often wired directly into quickly-written scripts.

If you’re the only one that runs your code, this solution may be perfectly workable.  However, distributing a script to colleagues or clients, or trying to plug it into a larger framework, will quickly reveal its shortcomings.  Users may have a difficult time determining which values to change, or inadvertently introduce errors.  So, this approach is error-prone at its worst and cumbersome at its best.

In this post, we’ll discuss a module in the Python standard library called ‘optparse’.

Optparse makes your life easier (and your scripts more usable) by providing objects and methods that automate the process of building well-defined and documented command line interfaces or CLIs.  Instantiating an object and providing a few input values is all you need to provide consistent, well-documented interfaces for any script you write.

The first step is to import the necessary objects.  Optparse ships as part of the standard Python library, and imports with the following:

from optparse import OptionParser

This makes the OptionParser object available to the script.  We’ll create a new instance of OptionParser by using Python’s constructor syntax:

 

parser = OptionParser()

Parser is now instantiated and ready to be populated with options, via calls to its add_option method.  In order to properly illustrate the remaining usage, we’ll introduce a simple example and work with it for the remainder of the article.

Example: Folder size detector

Let’s assume that we want to create a script which will recursively count the number of files contained in a folder and any of its subfolders.  This is fairly simple to implement using the walk generator function within Python’s os module.  The following line is not directly relevant to optparse, but probably bears a more in-depth discussion:

 

for root, dirs, files in os.walk(dir):

Walk is a special type of Python function called a ‘generator’, which computes a small piece of a larger result and returns it in steps.  In this case, the partial results are lists of subitems our script encounters as it traverses each folder in a directory structure.  Use of a generator in this context is very efficient, as Python is only storing one step of the traversal at a time.  Traversing a large directory (say your C:\ drive, for instance) without generators would consume a very large amount of memory.
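
To make the step-by-step behaviour concrete, here is a minimal sketch that pulls results out of os.walk one step at a time with next().  It assumes Python 3, and the throwaway folder layout (a.txt, sub, b.txt) is made up purely for illustration:

```python
import os
import tempfile

# Build a small throwaway tree so the example is self-contained:
#   top/a.txt
#   top/sub/b.txt
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    open(os.path.join(top, "a.txt"), "w").close()
    open(os.path.join(top, "sub", "b.txt"), "w").close()

    # Calling os.walk only creates the generator; no folder is read yet
    walker = os.walk(top)

    # Each call to next() explores exactly one more folder
    first_step = next(walker)    # the top folder itself
    second_step = next(walker)   # its only subfolder

    print(first_step)
    print(second_step)
```

A for loop does the same thing behind the scenes, calling next() until the traversal is finished.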

Walk makes it easy to determine file counts across all subfolders, because filenames are returned as a Python list.  We can use the len() function to get the length of the list, thus obtaining our file count.  Since the for loop is executing in steps, it is necessary to declare a variable outside of the loop to hold our result.  The completed syntax is as follows:

 

subFiles = 0

for root, dirs, files in os.walk(dir):
  subFiles += len(files)

The for loop will iterate until it has traversed ‘dir’ and all its subdirectories.  As it visits each folder, it will increment the running total number of files.

Back to Optparse

This script is functional, and potentially useful in a few different scenarios.  However, it only gives a summary file count for the top-most directory.  We could trivially modify the script and add the ability to print out file counts for all subfolders within the tree.  Using optparse allows us to easily add a command line interface which preserves both modes.
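
Before any command line handling is added, the two modes can be sketched as a plain function with a boolean switch.  This is only an illustration under Python 3; the function name count_files, its verbose parameter and the sample folder layout are hypothetical and not part of the script developed in this post:

```python
import os
import tempfile

def count_files(top, verbose=False):
    """Return the total file count under top.

    When verbose is true, also print the count for every folder visited.
    """
    total = 0
    for current_dir, dirs, files in os.walk(top):
        if verbose:
            print(current_dir + ": " + str(len(files)))
        total += len(files)
    return total

# Throwaway tree: one file at the top, two in a subfolder
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    for name in ("a.txt", os.path.join("sub", "b.txt"), os.path.join("sub", "c.txt")):
        open(os.path.join(top, name), "w").close()

    summary_total = count_files(top)                # silent, summary only
    verbose_total = count_files(top, verbose=True)  # prints per-folder counts
```

Both calls return the same total; the switch only changes what is reported along the way.  What optparse adds is a way for the user to flip that switch from the command line.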

The following paragraphs will scratch the surface of the optparse module by walking through two different examples.  To get the full picture, read through the documentation.

 

parser = OptionParser()

parser.add_option("-v", "--verbose",
      help="Recursively print file counts for this folder and all subfolders",
      action="store_true",
      dest="verbose")

This syntax adds an option to the ‘parser’ instance of OptionParser.  The first line provides a list of switches which your script will accept.  These will be familiar to OS X/*nix users as short and long options.  On the next line, OptionParser accepts a help string which is used to provide a brief description of the option and its usage.  The action value can be one of several predefined strings (in this case, store_true sets a variable to true).  Finally, the dest argument specifies which attribute of the options object returned by parse_args will receive the value.
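
A quick way to see what this option does is to hand parse_args an explicit argument list instead of letting it read the real command line.  The sketch below assumes Python 3; the path is a placeholder and the variable names are only for illustration:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-v", "--verbose",
                  help="Recursively print file counts for this folder and all subfolders",
                  action="store_true",
                  dest="verbose")

# With the flag present, the 'verbose' attribute is set to True and the
# remaining positional argument ends up in args
options, args = parser.parse_args(["-v", "/path/to/directory"])
with_flag = options.verbose

# Without the flag, the attribute keeps its default, which is None
# because no default value was given to add_option
options_plain, args_plain = parser.parse_args(["/path/to/directory"])
without_flag = options_plain.verbose

print(with_flag, without_flag, args)
```

Since None is falsy, a test such as "if options.verbose:" behaves the same whether the default is None or False.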

 

parser.add_option("-o", "--output",
            help="Specify name of file to write summary",
            metavar="OFILE",
            action="store",
            type="string",
            dest="outFile")

             
This example is similar in that it allows the user to specify short and long options, a help string and an action to perform on a destination.  In this case, a string is stored in a field named outFile.  The concept of a metavar is also used.  A metavar is the placeholder name shown in the help text for the option’s value; use one when you want to show the user an intuitive name that differs from both the option names and the destination.
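
The difference between dest and metavar is easiest to see side by side.  In this small sketch (Python 3 assumed; the file and folder names are placeholders) the parsed value lands in outFile, while OFILE appears only in the generated help text:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-o", "--output",
                  help="Specify name of file to write summary",
                  metavar="OFILE",
                  action="store",
                  type="string",
                  dest="outFile")

# dest controls where the parsed value is stored
options, args = parser.parse_args(["--output=out.txt", "some_folder"])
stored_name = options.outFile

# metavar controls how the value is labelled in the help text
help_text = parser.format_help()

print(stored_name)
print(help_text)
```

The short form works the same way: passing "-o", "out.txt" stores the same value in options.outFile.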

Putting it all together

 

# Import the necessary objects from
# the python standard library
from optparse import OptionParser

# used to import the walk function
import os
import sys

# Options is declared globally so that it will be available
# to the entire script without being passed around.  It will
# be populated with data later
options = None

# OS walking will be wrapped into a function
def countFiles(dir, destination):

  # Declare counter to aggregate results
  subFiles = 0

  # os.walk yields a string and two lists at each step
  #  current_dir ->  name of directory being explored
  #  dirs ->  subdirectories of current_dir
  #  files -> list of files in the current dir
  for current_dir, dirs, files in os.walk(dir):

      # increment the counter with the number of files
      # in the current directory
      subFiles += len(files)

      # If 'verbose' field is true (i.e., -v or --verbose is
      # used in the CLI invocation) the script will print out
      # any intermediate directories and file counts
      if options.verbose:
        destination.write(current_dir + ": " + str(len(files)) + "\n")

  return subFiles


if __name__ == "__main__":

  # Use the constructor to create a new option parser
  parser = OptionParser()

  # Add option aliases, documentation strings and behaviors
  #  This will set a field named 'verbose' to true if it is used
  #  on the command line
  parser.add_option("-v", "--verbose",
              help="Recursively print file counts for this folder and all subfolders",
              action="store_true",
              dest="verbose")

  parser.add_option("-o", "--output",
              help="Specify name of file to write summary",
              metavar="OFILE",
              action="store",
              type="string",
              dest="outFile")

  # parse_args returns two values
  #  options -> object holding the state of flags from the CLI
  #  args -> any positional arguments encountered after parsing options
  (options, args) = parser.parse_args()

  # The directory to explore is the one required positional argument;
  # parser.error prints the usage message and exits
  if len(args) != 1:
      parser.error("expected exactly one directory argument")

  # Check to see if an output file was specified
  # if not, use standard out (print to the console)
  if options.outFile is not None:
      results = open(options.outFile, 'w')
  else:
      results = sys.stdout

  # Write the final result to the console or file
  results.write(args[0] + ": " + str(countFiles(args[0], results)) + "\n")

  # Close the output file if one was opened
  if results is not sys.stdout:
      results.close()

Invoking the script with the options “-o out.txt --verbose /path/to/directory” will create a text file in the current directory which contains the results of exploring “directory” and all of its subfolders.  Alternatively, using “/path/to/directory” alone will calculate the partial results silently and then print the summed total to the screen.  Finally, using “--help” will print a summary of the operation:

Usage: dirCounter.py [options]

Options:
  -h, --help            show this help message and exit
  -v, --verbose         Recursively print file counts for this folder and all
                        subfolders
  -o OFILE, --output=OFILE
                        Specify name of file to write summary

With a little forethought, modules like optparse make it easy to create user-friendly and self-documenting CLI scripts.


Logik — So you Want to do Business in Boston?

So you Want to do Business in Boston? Posted By Daniel Kaiser, Esq. on July 12, 2010

Never mind dropping your Rs, how’s your WISP?

And no, I don’t mean lisp. How’s your Written Information Security Plan?

Vigorous identity theft regulations introduced by the Massachusetts Office of Consumer Affairs and Business Regulation (201 CMR 17.00 et seq.) require any person or business that owns or licenses (receives, maintains, processes or accesses) personal information about a resident of the Commonwealth of Massachusetts to meet minimum standards in safeguarding that personal information, whether in paper or electronic form. Such parties must develop and implement a Written Information Security Plan to protect personal information in a manner fully consistent with industry standards and other applicable laws and regulations.

In this case “personal information” is defined as a Massachusetts resident’s first name and last name or first initial and last name in combination with any one or more of the following data elements that relate to such resident: (a) Social Security number; (b) driver’s license number or state-issued identification card number; or (c) financial account number, or credit or debit card number, with or without any required security code, access code, personal identification number or password, that would permit access to a resident’s financial account; provided, however, that “Personal information” shall not include information that is lawfully obtained from publicly available information, or from federal, state or local government records lawfully made available to the general public.

So what do you need to do? A few highlights pulled from the Regulations’ obligations include:

  • Write, implement, maintain and monitor a “comprehensive written information security program”;
  • Appoint an Information Security Manager or committee who will help your staff carry out the Information Security Plan and audit compliance regularly;
  • Provide written notification when you know or have reason to know that there has been a security breach; and
  • Dispose of personal information in a manner that precludes access to or reconstruction of the documents.

While this brings a little extra in the way of administrative oversight, it’s certainly doable... and worth it. Like Massachusetts health care reform, it’s likely a sign of things to come. Just look at the attention Facebook and Google have been enjoying, look at the fees paid to IT Security consultants, and there’s no arguing it: privacy’s stock is rising.

So check out the specific administrative, physical and electronic security measures required by the Regulations. Because if you’re not doing business in Boston now, you may be soon... and then you’ll be wicked late!

Shameless promotion: Logik complies with Massachusetts’ standards ;)


Logik — Social Media takes on SLAPP-happy litigation

Social Media takes on SLAPP-happy litigation Posted By Daniel Kaiser, Esq. on July 1, 2010

Meet Justin Kurtz, an undergrad with more than 12,000 friends. Facebook friends. 12,000?! Why the popularity? Apparently Kurtz knows how to take a “SLAPP”... a Strategic Lawsuit Against Public Participation, that is. Or at least Kurtz and his 12,000 friends hope he knows how.

The story here is about a clash of two titans—the corporate or government plaintiffs willing to litigate to force a vocal critic to back down vs. the little guy channeling the power of social networks.

Mr. Kurtz states that, although he held a parking permit, a local towing company disabled his car alarm, towed his car from his apartment complex, and levied a $118 fine for the car’s release. Counsel for the towing company said the permit wasn’t visible. Either way, this kind of thing tends to irritate a car owner who feels wronged. Looking for a way to get his story out, Kurtz created a Facebook page called “Kalamazoo Residents against T&J Towing.”[1] Enter the power of social media: within two days of its creation 800 people had joined the group. It might have stopped there, but where Kurtz saw free speech, the towing company saw defamation… to the tune of $750,000 in damages.

The criticism against SLAPP suits is that often the plaintiff doesn’t necessarily even expect to win the fight in court; they expect to use their deeper pockets to pressure the defendant to clam up and avoid an expensive lawsuit. The strategy quite often works, but in this case, with the help of social media, the equation seems to have shifted. Kurtz’s collection of friends grew rapidly from 800 to over 12,000, and other social media outlets (Google Maps reviews, Yahoo! Local, and Yelp) have been enlisted with endless stories (true or otherwise) of apparent encounters with the plaintiff. Kurtz himself claims to have done nothing wrong: “The only thing I posted is what happened to me.”[2]

All this grassroots attention lends support to The Citizen Participation Act (H.R. 4364),[3] a federal bill sponsored by Representative Steve Cohen of Tennessee and Representative Charlie Gonzalez of Texas. The purpose of this proposed legislation is to enable a defendant who believes himself or herself to be the victim of a SLAPP to petition for dismissal of the suit. As enumerated by the Public Participation Project, the bill includes the following provisions:

1.    Immunity for Petition Activity (for all petition activity performed without knowledge or reckless disregard of falsity);

2.    Protections for Petition and Speech Activity (in connection with an issue of public interest);

3.    Federal Removal Jurisdiction (in consideration of the fact that roughly half the states have no similar anti-SLAPP provisions, to allow for removal to federal jurisdiction when the defendant claims the defense of immunity under the pending Act);

4.    Special Motion to Quash (to protect anonymous speech when the anonymous speaker’s personally identifying information is sought);

5.    Fees and Costs (including reasonable attorney’s fees, for the party who prevails on a special motion to dismiss or quash);

6.    Bankruptcy Non-Dischargeability of SLAPP and SLAPPBACK Awards (making fees awarded non-dischargeable in bankruptcy for successful SLAPP defendants and for defendants who are allowed to recover damages incurred in defending against a SLAPP); and

7.    Exemptions (non-applicability of the bill to claims brought solely in the public interest or from advertising speech, to protect against abuse of the statute).[4]

[1] http://www.facebook.com/group.php?v=wall&gid=288159562692
[2] http://www.nytimes.com/2010/06/01/us/01slapp.html
[3] http://www.anti-slapp.org/?q=node/16
[4] Id.


Logik — There’s an App for That

There’s an App for That Posted By Daniel Kaiser, Esq. on June 21, 2010

There is soon to be an app for just about everything. Thankfully, the legal community isn’t about to be left behind. Looking for a few apps that can actually make you smarter? Here are a few in no particular order that have caught my attention over the past month.

LawStack

This stack is much more fun than the law stacks you remember from law school, and the app gives you more than simply a touch of sophistication in your iPhone. The free version comes pre-loaded with a nicely indexed collection including the US Constitution, the Federal Rules of Appellate Procedure, the Federal Rules of Bankruptcy Procedure, the Federal Rules of Civil Procedure, the Federal Rules of Criminal Procedure, and the Federal Rules of Evidence. You can add federal and local statutes to your stack via the app store, but those come along with an additional fee.

 

 

 

PocketJustice

Just like LawStack, this app avoids the use of the space bar. Also like LawStack it’s free! PocketJustice brings you concise abstracts of US Supreme Court decisions along with voting alignments and personal bios for all of the current and former Supreme Court Justices. The free version comes loaded with the top 100 constitutional law cases complete with media content including oral arguments. As always, you can pay for more if you want to: the current price for the entire Supreme Court constitutional law canon is $4.99.

 

 

 

BarMax

Then there’s the not-so-free end of the spectrum. I can’t easily think of a new law school grad who hasn’t shelled out for a complete bar review course. That hasn’t gone unnoticed. This team of Harvard lawyers means it when they say “Max.” So far as I’m aware, this app is the biggest (over a gigabyte of lectures, outlines and checklists) and the first to hit the four-figure price tag: One Thousand Dollars. Well, $999. But hey, who’s counting? Especially when the other major bar prep courses will run you four times that amount? Even still, there’s something nice about the word “free”. BarMax thought of that... and is offering a free MPRE app to get you hooked.


Logik — Should Wexis Fear?

Should Wexis Fear? Posted By Daniel Kaiser, Esq. on June 17, 2010

Maybe it’s because I’m a legal research fan, maybe it’s because I like a deal, but William Manz’s recent article[1] in the New York State Bar Association Journal is pretty darn cool. The thing is, I’m not quite sure which part is the coolest.

Old, archived legal records and briefs have long been accessible only to lawyers and researchers who are willing to pay. As Manz points out, microfilm or microfiche records of New York Appellate Division cases only go back as far as the early 1970s. If you want to dig back a bit deeper, good luck! Google is working together with the Law Library Microform Consortium (LLMC) to change all that.

LLMC has been at work collecting archived materials in their physical format from law libraries. This is just the start, but it’s no small job. The New York Bar Association library worked for more than half a year, from January to August 2009, just to remove their volumes from basement stacks. From there Google has been arranging the shipment of materials to Googleplex in Mountain View, California, where their own high-speed scanners digitize them and record their size, author and publication date. Ultimately these volumes will be freely accessible via Google Scholar, complete with Google’s user-friendly search engine functionality.

The fun doesn’t end there. Sadly less accessible to people like me, the original copies of the newly-scanned materials are being sent to Hutchinson, Kansas for permanent storage with Underground Vaults & Storage, Inc. Describing itself as “one of the most secure and elusive underground storage facilities in the world,” this secure facility uses portions of a salt mine 650 feet below the surface. We’re talking about more than 1.7 million square feet of cool, dry storage space, where our legal heritage will live along with Hollywood archives and Cold War era documents. Their website boasts of “biometric scans, video cameras, redundant authorizations, steel vault doors, blind passwords, anonymous storage, restricted personnel access, infrared monitors, and more that we cannot reveal.”[2] As I understand it, access would even be a challenge for James Bond.

So should Wexis fear? Google isn’t exactly brand new to the game—Google Scholar has been gradually developing in content and functionality. But considering all the bells and whistles Westlaw and Lexis bring to the table, they’re not likely to be hurt. Those at risk, perhaps, are the collection of second-tier legal research services that cater to smaller firms and solos.

[1] William H. Manz, Recent Developments: Records and Briefs, N.Y. ST. B.A. J. 82, Feb. 2010, at 47.
[2] http://www.undergroundvaults.com/aboutus/hutchinson.cfm


Logik — Bootstrapped, Profitable, & Proud: Logik via 37signals

Bootstrapped, Profitable, & Proud: Logik via 37signals Posted By Andy Wilson on June 10, 2010

We were featured in 37signals’ Bootstrapped, Profitable, & Proud series this week. Here is a piece of the story…

From http://37signals.com/svn/posts/2385-bootstrapped-profitable-proud-logik

This is part of our series “Bootstrapped, Profitable, & Proud” which profiles companies that 1) have $1MM+ in revenues, 2) didn’t take VC, and 3) are profitable.

Q&A with Andy Wilson of Logik

What does your business do?
Logik helps companies find, organize, process, and make searchable terabytes of digital documents for legal discovery. I always say we sell digital aspirin to attorneys experiencing discovery migraines.

How successful is your business?
Financially, we’ve been very successful considering our size relative to the competition (most have close to or well over 100 employees, we have 16). We don’t reveal internal financials now, because 1. we are private and 2. we don’t want VC’s beating down our door anymore after what happened with the 2009 Inc 500 ranking.

With that said, from 2005 to 2008 we grew revenue by 1,067% from $373,866 in 2005 to $4.4 million in 2008 with about $3 million in profit. We did that with 8 employees, a ton of servers, niche software, and 1 dog. This, minus the profit, is all public information now. We were ranked #181 overall on the 2009 Inc 500 survey and #1 for eDiscovery companies.

Getting on the Inc 500 was a great marketing tool for us, because it helped some of our more skeptical, on the fence, customers realize we were indeed legit despite our small size. Although we don’t reveal financials anymore, we have doubled our company size, moved to a new, bigger, and more open office space closer to our customers, and we are hiring for more engineers and support staff.

How did you get started?
In 2004 Sheng and I met for some quality Chinese food in Virginia to discuss what would become Logik. Prior to Logik, we were working for a small legal printing company helping to destroy rain forests. No seriously, we would print hundreds of thousands of emails to paper, so that massive legal teams could manually review each page. Very efficient (odd fact: I worked on the Microsoft antitrust litigation and at one point was printing out Bill Gates’ email for a few weeks. He is very long winded.) After a few years of doing this and inhaling enough toner to paint your entire house black a few times over I decided I needed out and to find a better way to solve this problem. I mean, why would you “print” electronic documents to paper? Why not just print them to PDF or TIF? Ah-ha!

So, after letting the Chinese food settle we got to work drawing out the process flow for our document processing software. We quit our jobs, cut back on expenses, leased some servers, and got to work. I have a CS background, but Sheng is the real engineer and created the first version in just a few months. We got our first real customer 9 months after we started. This is how we got started.

Logik founders Sheng Yang and Andy Wilson.

How did you fund yourself at first?
Savings and credit cards. Our total startup costs were less than $20,000. Funny enough, we are still funding ourselves in the same way, but with a lot more savings and less credit card debt.

Did you ever consider taking on any investors?
Yes. We created a presentation, met with a dozen or so well-known investors, and then decided to scrap the idea altogether. We realized that it didn’t make any sense to give up control of our company for money. We couldn’t even figure out what we would spend the money on if we got more of it. So, we decided that, for us and for our culture, it would be best to keep growing organically. It was probably the best decision we’ve made yet.


Read the rest of the story on 37signals.com.


Logik — Mike Giroux Joins Logik as VP of Sales

Mike Giroux Joins Logik as VP of Sales Posted By Andy Wilson on June 1, 2010

Say hello to Mike Giroux, our new VP of Sales. Mike comes to us from Autonomy, one of the world’s largest eDiscovery companies, where he was a Sales Director. How Mike came to Autonomy is quite an interesting story of acquisitions. First, in July of 2008 Interwoven acquired Discovery Mining for $36 million. Mike was one of the earliest employees at Discovery Mining and helped build revenue from a few million to over fifteen million during his time. Second, not long after Mike was getting situated at Interwoven, Autonomy came along in January of 2009 and scooped up Interwoven for a cool $775 million. So, in the span of less than one year Mike worked for three different eDiscovery companies. All of them through acquisition.

So, why leave Autonomy, one of the world’s largest eDiscovery companies, and come work with Logik, a much smaller outfit? Simple: Mike wanted to help build and be a part of a smaller team that was building something great. Mike brings over eight years of eDiscovery sales experience to Logik. He’s worked with some of the world’s largest companies and law firms on some of the biggest and most complex eDiscovery projects on record. Mike actually started working at Logik last week and has already started making an impact.

We are very happy to have Mike on our team. Drop Mike a line and say hello, or connect with him on LinkedIn.


Logik — The Perks of Flying Solo

The Perks of Flying Solo Posted By Daniel Kaiser, Esq. on May 13, 2010

Thinking of becoming what The American Lawyer[1] calls the Lone Wolf? Recession and all, the trend shows that many lawyers think this is the perfect time to go solo—or to go boutique. It comes down to two attractive perks: value for clients and autonomy for lawyers.

In terms of value, there’s nothing to match the experienced lawyer who decides to go small. These days even big clients are looking to capture economies in new places, and solo attorneys and start-up firms are reaping the benefits. Anchored by attorneys with field experience in larger blue-chip firms, these smaller players are becoming known for delivering conventional big-law quality coupled with unconventional flexibility in terms of billing. Smaller office space (with freedom from long-term lease agreements), pared-down staffing (making use of contract attorneys and virtual paralegals as needed), alternative billing schemes (using flat fees or bonus fees for successful outcomes in place of billable hours), streamlined legal claims and subbing in autonomy and ownership in the place of higher profits are all elements contributing to the rise of the Lone Wolf.

When it comes to flying solo, the growing numbers haven’t been missed by the American Bar Association. The ABA points out that over 30% of American attorneys are now solos, and yet fewer than 7% of these solos are members of the Association.[2] Wanting to capture that segment, the ABA will be cutting annual dues in half for solo practitioners as of September 1, 2010.

Of course not just any graduate is ready to go solo… for those new to the profession the notion of solo practice might just descend to the level of malpractice. Small firms require a full skill set ranging from legal experience to marketing abilities. But for those lawyers established enough to be able to bring clients along with them, small can be golden. Efficiency is the new black.

[1] For a full write-up on this topic see The American Lawyer, Economy Model, 57-61 (Feb. 2010).
[2] ABA Journal, ABA Halves Dues for Solos, 65 (March 2010).


Logik — ESI in ADR? It’s your call.

ESI in ADR? It’s your call. Posted By Daniel Kaiser, Esq. on May 4, 2010

Recent blog posts have been popping up talking, with some alarm, about the rise of eDiscovery in ADR (Alternative Dispute Resolution). The idea seems to be that a once-friendly method for tabling business disputes is potentially being hamstrung by the encroachment of eDiscovery into the process. Granted, arbitration as an institution has developed or “matured” to such an extent that the old arguments for a faster, cheaper dispute resolution process often don’t ring true. But should the legal and business communities be alarmed?

With increasing frequency it’s becoming the reality that if you want to consider evidence at all, you’ll be considering electronic data. The argument goes that eDiscovery is today simply discovery… with an “e” appended to the front. This is easy to see when you look at the growing numbers of businesses of all sizes storing their records and communications primarily or exclusively as ESI (Electronically Stored Information). Think E-mail and spreadsheets. Enough said.

Without a doubt, ESI will be a growing presence in ADR. This is a natural and necessary progression in the history of arbitration (along with various other dispute resolution methods), and without this development arbitration would become somewhat archaic and ultimately hobbled.

Does this mean that the flexibility and speed associated with ADR are dead? No, it does not. Party autonomy, the ability to affect the scope and dimensions of your dispute resolution agreement before you sign it, will remain a fundamental element of ADR. Such autonomy is only lost when you’re dealing with unbalanced parties, such as when Party A says to Party B “if you want to do business with me, you’ll sign this dispute resolution agreement.” These scenarios are common, but in such scenarios Party A has always had the power to call the shots, including the power to choose traditional litigation with everything that entails. Whenever two companies deal with each other as equal Parties they maintain the ability to design a dispute resolution clause with an array of options. One of those options should certainly be the extent to which ESI will be handled, and how it will be handled. This requires direct discussions between the Parties covering what is necessary or reasonable and in what circumstances.

An extremely common practice found in dispute resolution clauses is the rote adoption of the sets of rules developed by one or another of several dispute resolution or arbitration organizations. These include, among others, the American Arbitration Association, the Chartered Institute of Arbitrators, JAMS, the International Chamber of Commerce, and the London Court of International Arbitration. Fortunately these institutions have been developing their own guidelines and protocols for addressing eDiscovery in the ADR context, and many of these protocols reflect an approach to eDiscovery that maintains greater constraints and streamlining than those found in traditional American civil litigation. As an example, take a look at the following clause entitled “Electronic Documents” taken from the ICDR (International Centre for Dispute Resolution) Guidelines for Arbitrators Concerning Exchanges of Information:

When documents to be exchanged are maintained in electronic form, the party in possession of such documents may make them available in the form (which may be paper copies) most convenient and economical for it, unless the Tribunal determines, on application and for good cause, that there is a compelling need for access to the documents in a different form. Requests for documents maintained in electronic form should be narrowly focused and structured to make searching for them as economical as possible. The Tribunal may direct testing or other means of focusing and limiting any search.[1]

[1] Taken from http://www.adr.org/si.asp?id=5288
Image Borrowed from: http://www.hobokenattorney.com/lawyer-attorney-1131734.html


Logik — ASAP Ale - World’s 1st eDiscovery Beer

ASAP Ale - World’s 1st eDiscovery Beer Posted By Andy Wilson on April 23, 2010

Stressful, crazy day? Our beer makes it all ok! Introducing “ASAP Ale,” the world’s first eDiscovery beer. With the success of Redaction, which was our entrée into the world of hand-crafted libations, we figured making beer was the next logikal step for an eDiscovery vendor…naturally.  A nice, cold one after an especially grueling day in eDiscovery can definitely put a person at ease, which is why a nicely stocked fridge at all times is a must.

The way we see it, we wouldn’t buy third-party eDiscovery software to ease our clients’ processing problems (HELLO Gridlogik!), so why should we drink someone else’s beer when we can make our own? ASAP Ale is an Imperial Ale weighing in at 8% alcohol, so it’s not for the faint of heart. It will be bottled in the middle of May with our fancy label slapped on it. Unlike Redaction’s monthly giveaway, the only way to get this prize is by stopping by our office and saying hello. We take our jobs and our clients very seriously, but we also like to kick back with a well-deserved cold one after a hard drive’s work (snort snort).  So don’t be shy, loosen that tie and come on by! 709 G Street NW, Washington DC 20001.


Logik — Project Manager @ Wiley Rein

Project Manager @ Wiley Rein Posted By Adam Reilly on April 21, 2010

Thank you for your help. You make us look good. Thanks!


Logik — Capturing TIFF metadata

Capturing TIFF metadata Posted By Adam Reilly on April 16, 2010


Image Courtesy of: http://www.cksinfo.com/clipart/construction/tools/magnifyingglasses/magnifying-glass-black-handle.png

Building from the same basic structure as the file system metadata gatherer (http://logik.com/whats_new/entry/capturing_file_system_metadata/), we can incorporate functionality to pull information from within the file.  Once documents have been reviewed and produced, it is very common for them to be converted from their native or ‘dynamic’ form into a more static page-oriented form such as a TIF image.  When the number of pages in a production approaches the millions, it becomes impossible to manually check every file for small details like compression, page orientation and resolution.  Using the ‘for’ loop from the previous example and incorporating a third-party library will make it possible to quickly generate a useful summary of all TIF images in a folder.

PIL

The Python Imaging Library (PIL - http://www.pythonware.com/products/pil/index.htm) is a general purpose image manipulation library for Python.  It has classes and methods to parse, load and manipulate images in several different formats.  We will demonstrate a very small part of the overall functionality, and it is well worth glancing through the documentation to figure out what else is possible. 

All of the required files are bundled into a Windows installer which can be obtained from the Pythonware website (http://www.pythonware.com/products/pil/index.htm).  Be sure to download the version that is appropriate for your particular installation of Python.  Once the installer has completed, we’re able to import the Image module with

from PIL import Image

We’ll modify the Glob loop so that it will only grab files with a TIF extension and then use that list as the variable in the for loop.  Note the slight difference between this example and the prior one (you’ll probably find yourself repeating or reusing patterns from time to time).

fileList = glob.glob(base_path + "*.tif")

Then, we can use the Image module’s open function, which takes a path and creates an in-memory representation of the image stored at that path.

im = Image.open(file)


If nothing’s gone wrong, the only thing left to do is access properties and methods of the im variable to get a summary of properties for every TIF image in the folder.  In this case, we’ll be interested in the compression, resolution, orientation and the number of pages.  PIL has built-in properties to handle most of these pieces of information.

Field        Information/Format
im.format    A string containing the format of the current image (“TIFF” in all these examples)
im.size      The dimensions of the image as an ordered pair of pixel values
im.info      A dictionary with different fields depending on the image type

Page Count

PIL does not have a built-in method or property for counting the number of pages in a file, so we’ll have to define our own.  First, we’ll take a brief detour into a general programming topic called “Exception Handling.”  Image objects in PIL have a method called seek() which accepts a zero-based integer as an argument and attempts to move to that page.  Trying to seek to page 35 in a one-page document will cause the script to enter a special state known as an exception.

Look before you leap

Exceptions occur when programs do something that is unexpected or undefined.  For instance, many languages raise a “divide by zero” exception when code attempts to divide a number by zero.  Exceptions are different from program crashes in that code which is likely to raise an exception can be wrapped inside special blocks of code which will try to perform the operation, detect an exception if it occurs and then execute cleanup code in order to allow the program to keep executing without crashing.  In Python, this special code is known as a try/except block.
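
As a minimal illustration of the pattern, the sketch below wraps a division in a try/except block so that a zero divisor is handled instead of stopping the program. The function name safe_divide is hypothetical and is not part of the script in this post.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if the division is undefined."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Fallback path: the exception is caught and the program keeps running
        return None


print(safe_divide(10.0, 4.0))   # 2.5
print(safe_divide(10.0, 0.0))   # None
```

Only ZeroDivisionError is caught here, so any other, unexpected problem would still surface rather than being silently swallowed.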

Since we have no way of determining in advance where a particular document ends, we can take advantage of the fact that seek raises an exception (an EOFError) when asked for a page that does not exist.  Essentially, we’ll just keep trying to move to the next page until an exception is raised.  The following function keeps track of the number of pages successfully accessed with a counter variable.

def tifPageCount(tif):

    pageCount = 1
    try:
        while True:
            tif.seek(pageCount)
            pageCount += 1
    except EOFError:
        pass

    return pageCount
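
Pillow, the maintained fork of PIL, exposes an n_frames attribute on multi-page images in its more recent releases; whether it is present depends on the installed version, so treat it as an assumption here. The sketch below prefers n_frames when it exists and otherwise falls back to the seek loop described above. The names page_count and FakeTif are hypothetical; FakeTif is only a stand-in that lets the example run without an image library installed.

```python
def page_count(tif):
    """Count the pages of an opened multi-page image object."""
    # Assumption: recent Pillow releases provide n_frames on the image
    frames = getattr(tif, "n_frames", None)
    if frames is not None:
        return frames

    # Fallback: seek forward until the library signals the end of the file
    count = 1
    try:
        while True:
            tif.seek(count)
            count += 1
    except EOFError:
        pass
    return count


class FakeTif:
    """Stand-in that mimics seek() raising EOFError past the last page."""

    def __init__(self, pages):
        self.pages = pages

    def seek(self, index):
        if index >= self.pages:
            raise EOFError("no more pages")


print(page_count(FakeTif(1)))    # 1
print(page_count(FakeTif(35)))   # 35
```

After the fallback loop the image is left positioned on its last page, so it may be worth calling tif.seek(0) before reading page-level properties again.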

 

Putting it all together

Here’s the full working code:

# imports functionality to enumerate files
import glob

# imports functionality to read command line arguments
import sys

# imports the Image module from the Python Imaging Library
from PIL import Image


# Using a provided Image object, continually seek to the next page until
# an EOFError is raised.  Keep track of the successfully encountered
# pages with a counter variable
def tifPageCount(tif):

    pageCount = 1

    # Code in this block will execute until the end of the image file
    # is reached
    try:
        while True:
            tif.seek(pageCount)
            pageCount += 1
    except EOFError:
        pass

    # try/except has completed, return the count
    return pageCount


if __name__ == "__main__":

    base_path = sys.argv[1] + "\\"

    # The glob module will find all files on a certain path
    # which match the pattern provided (in this case we’ll use
    # *.tif to match only tif images)
    fileList = glob.glob(base_path + "*.tif")

    # Store the delimiter in this variable for convenience
    d = "|"

    # iterate over the list of tiff files
    for file in fileList:

        # create an image object
        im = Image.open(file)

        # pull relevant information out of the image object
        imgFmt = str(im.format)
        imgSize = str(im.size)
        imgInfo = str(im.info)

        # Call the page counting function
        numPages = str(tifPageCount(im))

        # print the captured fields as a single delimited line
        print(file + d + imgFmt + d + imgSize + d + imgInfo + d + numPages)

 

Running this code in a folder of single-page TIFs yields the following results:

ABC0131816.tif|TIFF|(2550, 3300)|{'compression': 'group4', 'dpi': (300, 300)}|1
ABC0131817.tif|TIFF|(2550, 3300)|{'compression': 'group4', 'dpi': (300, 300)}|1
ABC0131818.tif|TIFF|(2550, 3300)|{'compression': 'group4', 'dpi': (300, 300)}|1
ABC0131819.tif|TIFF|(2550, 3300)|{'compression': 'group4', 'dpi': (300, 300)}|1
ABC0131820.tif|TIFF|(2550, 3300)|{'compression': 'group4', 'dpi': (300, 300)}|1
ABC0131821.tif|TIFF|(2550, 3300)|{'compression': 'group4', 'dpi': (300, 300)}|1

 

This information can be used to quickly identify any abnormalities with compression, resolution or page orientation.  Additionally, it is useful in determining page counts within a folder.  This could easily be ingested into a database or column-oriented processing program like Excel as an effective and thorough QC technique.
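
One way such a QC pass might look is sketched below. It reads delimited lines in the shape shown above and flags rows that differ from an expected specification. The expected values (group4 compression, 300 dpi, portrait pages), the function name find_abnormal, and the second sample row are all assumptions made up for illustration; a real production spec would supply its own.

```python
import ast

# Assumed production specification, for illustration only
EXPECTED_COMPRESSION = "group4"
EXPECTED_DPI = (300, 300)


def find_abnormal(lines):
    """Return (filename, reason) pairs for rows that break the assumed spec."""
    problems = []
    for line in lines:
        name, fmt, size, info, pages = line.strip().split("|")
        # Turn the printed tuple and dictionary back into Python objects
        width, height = ast.literal_eval(size)
        info = ast.literal_eval(info)
        if info.get("compression") != EXPECTED_COMPRESSION:
            problems.append((name, "compression"))
        if info.get("dpi") != EXPECTED_DPI:
            problems.append((name, "resolution"))
        if width > height:
            problems.append((name, "landscape orientation"))
    return problems


report = [
    "ABC0131816.tif|TIFF|(2550, 3300)|{'compression': 'group4', 'dpi': (300, 300)}|1",
    # Hypothetical bad row, invented to show the checks firing
    "ABC0131817.tif|TIFF|(3300, 2550)|{'compression': 'raw', 'dpi': (200, 200)}|1",
]

for name, reason in find_abnormal(report):
    print(name + ": " + reason)
```

This relies on the info dictionary printing as plain Python literals, as it does in the sample output; images carrying richer metadata may need a more forgiving parser.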

 


Logik — Capturing File System Metadata

Capturing File System Metadata Posted By Adam Reilly on April 4, 2010

This script will be a little shorter than some of the previous examples. However, it represents a fairly common use case within the field of eDiscovery.  As data moves from party to party in the collection/preservation stage of a matter, related files are often lumped into folders according to organizational need.  Summaries of the information in these folders are often crucial to everything from formulating a review strategy to determining timelines.  In this post, we’ll look at a technique for capturing file system metadata and collecting it for reporting purposes.

Glob

There are many different ways to traverse the file system with Python.  One of the simplest methods uses the Glob module (http://docs.python.org/library/glob.html).  The somewhat unintuitive name is a throwback to early Unix days, and refers to the process of finding all strings that match a particular pattern.  To make Glob available to your script, it is first necessary to add the appropriate import statement to the top of the source file.

import glob

 

Then, it’s possible to make calls to the glob method within the glob module (just go with it) in order to start building lists of filenames.  Once files are stored in lists, we’re free to start capturing information we care about.  In order to build the list, we’ll use the following syntax:

fileList = glob.glob(base_path + "*")



Notice that we are supplying a “*” wildcard along with a base path.  This causes the script to navigate to the folder we specify and make a list of any filenames that fit the pattern.  This pattern matches everything, but we could just as easily use stricter conditions such as “*.tif” (all TIFF images) or “*\\natives\\*” (only files in a natives folder), depending on our specific task.  Note that Glob can only operate on file or folder names, and it will only return results for the current folder, not its subfolders.
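
The behaviour described above can be tried out safely in a throwaway folder. The following sketch is only an illustration: the file and folder names are invented, and it uses os.path.join to build patterns rather than the hard-coded “\\” separator so that it also runs outside Windows.

```python
import glob
import os
import tempfile

# Build a throwaway folder: two TIFFs, one text file, one file in a subfolder
base = tempfile.mkdtemp()
for name in ("a.tif", "b.tif", "notes.txt"):
    open(os.path.join(base, name), "w").close()
os.mkdir(os.path.join(base, "natives"))
open(os.path.join(base, "natives", "c.tif"), "w").close()

# "*" matches every entry directly inside the folder, including subfolders
everything = sorted(os.path.basename(p)
                    for p in glob.glob(os.path.join(base, "*")))

# "*.tif" narrows the match to TIFF images in that same folder
tifs_only = sorted(os.path.basename(p)
                   for p in glob.glob(os.path.join(base, "*.tif")))

print(everything)   # ['a.tif', 'b.tif', 'natives', 'notes.txt']
print(tifs_only)    # ['a.tif', 'b.tif']
```

Note that the natives folder itself appears in the first list, while c.tif inside it appears in neither, which is why the full script later filters out directories.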

Stat

The stat module (http://docs.python.org/library/stat.html) is named after yet another throwback to Unix: stat is the name of a system call used to retrieve very detailed information about files in a file system.  We will use a subset of its capability to capture MAC (modified, accessed, created) times and size for every file in a directory.  The stat function itself lives in the os module, which must be imported with:

import os

This line allows calls to the stat function like the following:

fs_metadata = os.stat(path_to_file)

fs_metadata receives the result of the stat function, which consists of several pieces of metadata from the file whose full path is supplied as an argument.  For purposes of this demonstration, we will access and save the size in bytes and the various times associated with each file in a folder.  Once the assignment has occurred, it is possible to access various fields of information using dot notation.  For instance, accessing the file’s size is accomplished by accessing the “st_size” field.

sizeInBytes = fs_metadata.st_size

This will save a non-negative integer for later, when we print to a summary.  Accessing the file’s MAC times is similar, as we can see from the example of accessing modified time (created and accessed will be demonstrated in the full source listing at the end of the article).  Note that st_ctime holds the creation time on Windows; on Unix systems it holds the time of the last metadata change instead.

modTime = fs_metadata.st_mtime
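
To see these fields on a real file, the sketch below writes a small temporary file and inspects it. The snake_case variable names are just this example’s choice, and the five-byte content is arbitrary.

```python
import os
import tempfile

# Write a small throwaway file so there is something to inspect
handle, path = tempfile.mkstemp()
os.write(handle, b"hello")
os.close(handle)

fs_metadata = os.stat(path)
size_in_bytes = fs_metadata.st_size   # number of bytes written above
mod_time = fs_metadata.st_mtime       # float: seconds since the epoch

print(size_in_bytes)                  # 5
print(type(mod_time).__name__)        # float

# Clean up the temporary file
os.remove(path)
```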

Working with Timestamps

There is one final loose end to tie up before the report will be satisfactory.  Times reported by the stat function are stored internally as timestamps, or the number of seconds that have elapsed since a fixed reference date (the epoch).  If we were to print any of the MAC times without modification, they would look something like “1258124917.17”.  While this is perfectly suitable for sorting or comparison, it’s not very intuitive for human consumption.  Fortunately, it’s fairly easy to implement a function which takes floating point numbers and converts them to a wide variety of date strings.  Indeed, Python has date, time and datetime classes which split these entities into accessible fields and provide many methods for manipulating them.  For brevity and simplicity, we will convert our MAC times to the ISO 8601 combined date and time format (http://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations).  This captures both the date and time and combines them into a string which will sort correctly in Excel.

After importing the datetime class from the datetime module, we can write a function which converts a floating point number into a datetime object and calls its isoformat() method.

def floatToTime(timestamp):

    return datetime.fromtimestamp(timestamp).isoformat()
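
The function above converts to the local time zone of the machine running the script, so the same file can produce different strings on different machines. As a variation (not the approach used in this post), the sketch below pins the conversion to UTC so the result is reproducible, and checks that the resulting strings sort in the same order as the raw timestamps. The name float_to_iso_utc is hypothetical.

```python
from datetime import datetime, timezone


def float_to_iso_utc(timestamp):
    """Convert seconds since the epoch to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


print(float_to_iso_utc(0))   # 1970-01-01T00:00:00+00:00

# ISO strings in a single time zone sort in the same order as the raw numbers
stamps = [1258124917.17, 0, 86400]
iso_sorted = sorted(float_to_iso_utc(t) for t in stamps)
print(iso_sorted == [float_to_iso_utc(t) for t in sorted(stamps)])   # True
```

The trailing +00:00 marks the UTC offset; the post’s version produces strings without an offset because it works with local, time-zone-unaware datetime objects.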

Putting it all together

Here’s the full working code:

# imports functionality to enumerate files
import glob

# imports functionality to harvest filesystem metadata
import os
from stat import S_ISDIR

# imports functionality for converting times
from datetime import datetime

# imports functionality to read command line arguments
import sys


# Function which takes floating-point style timestamps and converts
# to an ISO-style string (YYYY-MM-DDTHH:MM:SS.ffffff).  These dates will
# sort properly if imported into a column-oriented data store like
# MS Excel
def floatToTime(timestamp):
    # use fromtimestamp method to convert the floating point
    # number into a ‘datetime’ object, then call that
    # object’s isoformat() method to give back a formatted
    # string
    return datetime.fromtimestamp(timestamp).isoformat()


if __name__ == "__main__":

    base_path = sys.argv[1] + "\\"

    # The glob module will find all files on a certain path
    # which match the pattern provided (in this case we’ll use
    # * to match everything)
    fileList = glob.glob(base_path + "*")

    # fileList has a list of files and folders that match the pattern
    # we will iterate over each in this for loop
    for file in fileList:

        # stat takes the full path to a file and returns an
        # object that contains many useful pieces of filesystem
        # metadata.
        fs_metadata = os.stat(file)

        # This if statement guards the print statement so that fs_metadata
        # will only be printed if the entry that we’re on is NOT a directory
        # In other words, information should only be printed out for files.
        if not S_ISDIR(fs_metadata.st_mode):

            # We’ll capture the file size by accessing a field of the
            # fs_metadata object.
            sizeInBytes = fs_metadata.st_size

            # We’ll access three fields from the fs_metadata object to
            # capture Modified, Accessed and Created times for each file
            # in the list
            #  Note: the times are stored as a floating-point timestamp, so
            #  we will use the conversion function to make it slightly more
            #  human-readable
            modTime = floatToTime(fs_metadata.st_mtime)
            accTime = floatToTime(fs_metadata.st_atime)
            creTime = floatToTime(fs_metadata.st_ctime)

            # Finally, we’ll print all the values into a delimited format that
            # programs like Excel should be able to read easily
            print(file + "|" +
                  str(sizeInBytes) + "|" +
                  modTime + "|" +
                  accTime + "|" +
                  creTime)

 

Running this code in the “C:\Python31\DLLs” folder yields the following results:

C:\Python31\DLLs\bz2.pyd|68096|2009-08-17T17:03:50|2009-10-20T16:37:52.281250|2009-08-17T17:03:50
C:\Python31\DLLs\py.ico|19790|2007-12-06T08:47:58|2009-10-20T16:37:52.265625|2007-12-06T08:47:58
C:\Python31\DLLs\pyc.ico|19790|2007-12-06T08:47:58|2009-10-20T16:37:52.265625|2007-12-06T08:47:58
C:\Python31\DLLs\pyexpat.pyd|152576|2009-08-17T17:04:36|2009-10-20T16:37:52.281250|2009-08-17T17:04:36
C:\Python31\DLLs\select.pyd|11776|2009-08-17T17:04:46|2009-10-20T16:37:52.281250|2009-08-17T17:04:46
C:\Python31\DLLs\sqlite3.dll|302080|2009-08-13T19:57:14|2009-10-20T16:37:52.328125|2009-08-13T19:57:14
C:\Python31\DLLs\tcl85.dll|867328|2008-11-06T20:29:16|2009-10-20T16:37:52.343750|2008-11-06T20:29:16
C:\Python31\DLLs\tclpip85.dll|8192|2008-06-12T18:15:40|2009-10-20T16:37:52.343750|2008-06-12T18:15:40

 

This data can be redirected from the command prompt or written to a file and imported cleanly into Excel.  Note that we used “|” as a delimiter, as it cannot appear in Windows path strings. 
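
For the “written to a file” route, one option is Python’s csv module with the delimiter set to “|”. The sketch below is only an illustration under assumptions: the rows, the column names, and the report.txt filename are invented, and the file goes to a temporary folder.

```python
import csv
import os
import tempfile

# Hypothetical rows in the same shape as the report above
rows = [
    ["C:\\data\\one.txt", 68096, "2009-08-17T17:03:50"],
    ["C:\\data\\two.txt", 19790, "2007-12-06T08:47:58"],
]

out_path = os.path.join(tempfile.mkdtemp(), "report.txt")

# newline="" lets the csv module control line endings itself
with open(out_path, "w", newline="") as out_file:
    writer = csv.writer(out_file, delimiter="|")
    writer.writerow(["path", "size_bytes", "modified"])
    writer.writerows(rows)

# Read the file back to confirm that it splits cleanly on the delimiter
with open(out_path, newline="") as in_file:
    read_back = list(csv.reader(in_file, delimiter="|"))

print(read_back[1])   # ['C:\\data\\one.txt', '68096', '2009-08-17T17:03:50']
```

Adding a header row is a design choice rather than something the original script does; it simply makes the columns easier to identify once the file is opened in a spreadsheet.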

 


Logik — Post your Redaction taste results to Corkd

Post your Redaction taste results to Corkd Posted By Andy Wilson on April 2, 2010

It took almost 2 years to fully taste our wine and boy were we pleased when we finally did. Redaction packs a powerful punch of sweet fruit and oaky flavors with a hint of that good ol’-fashioned Zin spiciness. At 14.5% alcohol, with deep red legs, you may want to take it easy sipping this Zin.

Some notes about the wine: the grapes we selected come from the Grist Vineyard in the Dry Creek Valley and are used in some of the highest quality brands in California. We wanted the wine to reflect our eDiscovery technology and service, unique and carefully handcrafted.

For all the people we’ve shared this wine with, we hope you enjoy it as much as we did! Please share your own experiences with Redaction on Corkd at:

http://corkd.com/wine/view/113667-2008-redaction


Logik — Our 10 Core Values

Our 10 Core Values Posted By Andy Wilson on March 31, 2010

Most companies have core values, but few actually live by them. At Logik we truly live by our values in everything we do. It’s what drives us to be better. It’s how we recruit, hire, and promote.

1. Trust your team

The best relationships are built on mutual trust. At Logik we build trust by taking risks, sharing responsibilities, and letting people do what they think is good for the company (but not reckless). Without trust in your team, very little can be accomplished. And that’s no way to run an efficient company.

We trust our employees to get their work done. This means they don’t need to be “seen working” in the traditional manufacturing-floor sense. We don’t expect you to get in to work “on time” or even come into the office every day. So long as you get your work done in a timely manner and contribute to the success of the company, by all means, work as you please. You don’t need to ask permission to attend your kids’ field trip or take a two-hour lunch with a friend. Just do it. We trust you, which is why we hired you.

So, take risks, share some of your responsibilities, work when you want (but get your work done), and trust your team.

2. Work with our clients, not just for our clients

No one likes to be an order taker (well, maybe some do, but not us). Taking orders kills creativity, which isn’t helpful for clients in need of our services. Instead of working for our clients we need to work with them. It’s why they hire us. They realize that we can help them remove the stress and frustration with eDiscovery.

By helping our clients with their problems they can have a more productive day and hopefully a better work-life balance. Many of our clients work under very stressful conditions with long work hours. We can’t change the culture they are in, but we can help make their lives just a little bit easier. That’s what they pay us to do. We do that by working with them, not just for them.

3. Be creative, but also practical

Efficiency comes from creativity and everyone at Logik is creative, thus we are also (usually) efficient. We thrive on creativity and it comes from so many places. Eating dinner, walking to work, or in your sleep you may have that next great idea you need to share with everyone, because it’s just that awesome. We love that. Some ideas, like “what if we trained bald eagles to deliver and pick up our customers’ data,” are indeed VERY creative, but not very practical. So be creative, but also be practical in your creativity. We will all be better for it and likely more efficient.

4. Show passion and determination

Working at a company without passionate and determined employees is simply NOT fun. We believe work should be fun and challenging. Working in the eDiscovery industry isn’t akin to working on “the cure”, but that doesn’t mean we can’t be passionate about what we do.

Our core passion comes from making our clients happy, which means we are determined to solve their more challenging gigabyte-related problems. We are also intensely passionate about technology and how we can use technology to solve complex eDiscovery problems. Being passionate about what you do pushes you to be more creative, helpful, and focused.

5. Don’t be annoying

These days, in such an always-on and demanding society, the last thing we want to be is annoyed. We get that. Which is why we strive to not be annoying to ourselves and to our clients. We don’t hold unnecessary meetings to just hear ourselves talk. We don’t interrupt people hard at work with questions that could be answered later. We don’t say ASAP. We don’t do a lot of things we think are annoying and pointless.

For clients, we ask “would this annoy me if I were the client?” If the answer is yes, then most likely it will annoy our client, so we probably shouldn’t do it. Pretty simple concept, but so many businesses create layers and layers of bureaucratic B.S. These layers end up annoying the very people they are trying to serve. Ironically those businesses end up 1. alienating their customers or 2. going out of business. Well, we have no intentions of becoming aliens or going out of business.

6. Be flexible

We are a young company, so our “bones” haven’t yet set all the way. We are flexible in how we work, how we approach problems, how we treat our employees and customers, and how we grow our company. Being flexible is important in any company, because sometimes you need to throw pride aside and do what’s right for the team. Stubbornness for stubbornness’ sake is never rewarded.

As we grow the company we will change directions, add new services, kill old ones, and do new and interesting things for our employees and customers. We ask that everyone at Logik understand this and adapt as necessary. Because in the end, it’s the companies that can adapt and change quickly that survive.

7. Continue education

We all learn a lot at Logik. We learn how to run an efficient business, how to develop applications, how to work with demanding clients, and even how to play a mean game of ping-pong. Even though we learn a lot of skills on the job, we encourage everyone to learn off the job. So take classes, teach classes, read books, or volunteer in your community. Just do something to keep feeding that big brain of yours that doesn’t involve work. Well-rounded people make well-rounded employees.

8. Be humble

As kids, we were told not to brag and not to showboat. We were told to simply be humble. As grownups, there is no one there to remind us to be humble every day. At Logik, everyone around you is humble. It’s a kind of quiet confidence we all carry. We know we are good at what we do, but we don’t need to shout it from a mountain top, so to speak.

9. Focus on quality

In service and in product, quality trumps everything. Having quality service leads to happy employees, which leads to happy clients. Having a quality product leads to fewer defects, which leads to happy clients. Our job is to make happiness happen, so we value quality above speed and price in order to achieve this. This doesn’t mean we are tortoise-slow or over-priced. That would be annoying and we don’t play the annoying game (see above).

10. Create a positive family-like culture

Caring about each other is essential to any positive company culture. At Logik we support each other, we help each other out, and we understand that family is very important. Which is why we hire for culture fit first and resume second. We all want to work with people we enjoy working with. It makes working less like work and more like play.

 

Logik — Uncle Sam to Project Managers: “I Want You”

Uncle Sam to Project Managers: “I Want You” Posted By Daniel Kaiser, Esq. on March 31, 2010

Ben Bain’s article in Federal Computer Week is worthy of a read. His article highlights the Office of Inspector General’s (OIG) most recent report to Congress – a report including the ten most significant challenges faced by the National Archives and Records Administration. This top-ten list reads, for the most part, like a wish list of the skills and resources in high demand here in the world of eDiscovery:

1.    Electronic Records Archives – How is the government’s document retention coming along? The OIG finds the agency’s success uncertain, stating that the system “has experienced delivery delays, budgeting problems, and contractor staffing problems.”

2.    Records Management – Yet again, we’re talking about the issue of mushrooming electronic records.

3.    IT Security – We’ve mentioned in this blog before that IT Security is the new goldmine in today’s and tomorrow’s economy.

4.    Public Access to Records – With NARA’s role in declassification of public records, think document productions on a massive scale.

5.    Storage Needs – Across the spread of federal agencies, NARA is looking to ensure compliance with record storage regulations.

6.    Preservation Needs – Here we’re looking at an issue that is only getting bigger: a growing backlog of records to preserve.

7.    Project Management – Technical oversight to ensure results and budget adherence is always in demand.

8.    Physical and Holdings Security – Here we’re talking about the security of staff and records against natural and manmade disasters.

9.    Contract Management and Administration – Teams of contractors working for NARA mean management and oversight challenges.

10.  Workforce Issues – The OIG speaks here of the need to assess the agency’s needs so that they can hire and retain people with the necessary technical abilities.


The skill set sounds pretty familiar, doesn’t it?

Logik — Redaction, the world’s 1st eDiscovery wine, is here!

Redaction, the world’s 1st eDiscovery wine, is here! Posted By Andy Wilson on March 18, 2010

After almost 2 years in the making, Logik Redaction is here. Redaction is the world’s 1st custom-made eDiscovery wine. It’s a red zinfandel from Dry Creek Valley, CA. Weighing in at 14.5% alcohol with hints of vanilla and oak, it is a very tasty Zin (we had a wine tasting at 10am today and we’re what you call “experts”).

We made this wine for ourselves, friends, family, and our amazing clients. Tomorrow we will draw for the Redaction case giveaway, so cross your fingers. If you get to taste Redaction, please let us know what you think about it. You may even find it in some local DC wine bars, so keep an eye out. Cheers!

Logik — The Cloud Computing Advancement Act?

The Cloud Computing Advancement Act? Posted By Daniel Kaiser, Esq. on March 15, 2010

The question: Is it just a pep talk to encourage someone else to act? Or does an actual draft of the proposed bill exist somewhere in Microsoft’s corporate legal department?

In January, Brad Smith (General Counsel of Microsoft) spoke at the Brookings Institution Policy Forum here in Washington, D.C. Mr. Smith came to speak with academics and industry leaders about something dear to Microsoft’s heart – cloud computing.[1] Stressing the importance of a “safe and open cloud,” a need more recently underscored by the heavy impact of shadowy hackers on cloud computing, corporate security and international relations, Smith urged the passage of new legislation to “promote innovation, protect consumers, and provide the Executive Branch with the new tools needed for a new technology era” including the following:

  • Improvements in privacy protection and data access rules to ensure users’ privacy, starting with reforming and strengthening the Electronic Communications Privacy Act to clearly define and provide stronger protections for consumers and business;
  • Modernization of the Computer Fraud and Abuse Act so law enforcement has the tools it needs to go after malicious hackers and deter instances of online-based crimes;
  • Truth-in-cloud-computing principles to ensure that consumers and businesses will know whether and how their information will be accessed and used by service providers and how it will be protected online;
  • Pursuit of a new multilateral framework to address data access issues globally.[2]


It’s interesting to note that Mr. Smith pitched alternative approaches to regulating the industry: an industry level self-regulatory code vs. federal law administered by a federal agency (such as the F.T.C.). To me, this suggests that no bill yet exists to present to Congress. I have no doubt that when it’s ready to go it will be getting plenty of attention.

[1] The full text of Mr. Smith’s speech can be found here: http://blog.seattlepi.com/microsoft/library/20100120smithspeech.pdf

[2] Elements taken from http://www.microsoft.com/presspass/press/2010/jan10/1-20BrookingsPR.mspx.

Logik — Getting SaaSy with your Vendor

Getting SaaSy with your Vendor Posted By Daniel Kaiser, Esq. on March 10, 2010

Choices, choices... Trying to decide among all those competing SaaS providers?

Today’s post is a direct hat tip to Joshua Poje, attorney and Research Specialist with the ABA’s Legal Technology Resource Center (LTRC), and coauthor of the LTRC’s legal technology blog “ABA Site-tation.” In January Mr. Poje brought us The ABCs of Cloud-Based Practice Tools, including this list of 18 key questions to ask a SaaS vendor before signing on the dotted line. Several of these questions apply equally to a potential IaaS provider. It pays to ask a few questions, so go on, get SaaSy!

Drumroll, please:

  • Do you offer a trial period or demo of your product?
  • What training options are available for customers?
  • How often are new features added to the product?
  • How many attorneys are currently using your product?
  • What hours is your tech support available?
  • Do you offer a Service Level Agreement (SLA) and/or would you be willing to negotiate one with me?
  • What types of guarantees and disclaimers of liability do you include in your Terms of Service (TOS)?
  • How do you safeguard the privacy/confidentiality of stored data?
  • Who has access to the firm’s data?
  • Have you ever had a data breach?
  • How often, and in what manner, is users’ data backed up?
  • What is your company’s history – e.g., how long have you been in business, and where do you derive your funding?
  • Can I remove or copy my data from your servers in a non-proprietary format?
  • Where does the data reside – inside or outside of the United States?
  • What happens to the firm’s data if the company fails?
  • Do you require a contractual agreement for a certain length of service (e.g., 12 months, 24 months)?
  • What is the pricing history for your product? That is, how often have rates been increased?
  • Are there any incidental costs I should be aware of? [1]

[1] http://www.abanet.org/lpm/lpt/articles/ftr01103.shtml

Logik — Harmful if Swallowed

Harmful if Swallowed Posted By Daniel Kaiser, Esq. on March 9, 2010

Have you finished digesting that data, sir?

Spoliation simply can’t get much worse than this. Following his arrest outside of a bank in Queens, New York this January, Florin Necula apparently swallowed a 4 GB Kingston flash drive in an attempt to keep Secret Service agents from discovering the evidence. Already facing a charge for the use of a “skimmer” to collect ATM and credit card numbers, Necula earned himself a further charge of obstruction of justice with his bizarre version of spoliation.

According to court documents, while in custody at the US Secret Service offices Necula grabbed the flash drive “which had been on his person at the time of his arrest, and swallowed. Doctors from the New York Downtown Hospital later removed the flash drive because they were concerned that Necula would be injured if they allowed the flash drive to remain inside him.”[1]

This procedure to “safely eject” the flash drive from Necula’s operating system took place four days after it had been ingested. It seems that the suspect was proving, shall we say, retentive.

Was your account information lodged in this man’s descending colon? I, for one, wouldn’t care to be the forensics expert to find out.

[1] Affidavit in support of search warrants, issued to the United States District Court, Eastern District of New York.

Logik — LFP…WTF?

LFP…WTF? Posted By Adam Reilly on March 4, 2010

Image Source: xkcd

In this post, we’ll build on the previous post’s technique of iterating through a file line by line.  LFP files are an extremely common form of data interchange as document sets trade hands in litigation.  Their popularity is probably due in part to their simplicity.  As a review, LFP files are plain text files, where each record is a comma-delimited, newline-terminated collection of five fields.  Find more details on the file format or fields here (http://platinumlit.wik.is/%28LFP%29_iPRO_Load_File).

Since the record structure is fairly simple and predictable, MS Access, Excel or SQL databases are popular choices for manipulating or exploring LFP files.  These tools are certainly appropriate for the job; however, it is possible to exceed the storage capacity of Excel and even Access in certain extreme cases.  At a minimum, each of these approaches requires a certain amount of overhead associated with importing the data.  Python can offer a dramatic speedup for large LFP files or tasks (QC, reporting, etc.) that need to be performed repeatedly.  We will work through a few such cases in the remainder of this post.

Sample Data

For the next several examples, we’ll use a small fictitious dataset comprised of the following records (if only they could all be this simple).  The set consists of ten single-page TIFF images taken from three documents.

IM,ABC00001,D,0,@DEF1022104;DEF10221042\0000;ABC00001.TIF;2

IM,ABC00002, ,0,@DEF1022104;DEF10221042\0000;ABC00002.TIF;2

IM,ABC00003, ,0,@DEF1022104;DEF10221042\0000;ABC00003.TIF;2

IM,ABC00004, ,0,@DEF1022104;DEF10221042\0000;ABC00004.TIF;2

IM,ABC00005, ,0,@DEF1022104;DEF10221042\0000;ABC00005.TIF;2

IM,ABC000006,D,0,@DEF1022104;DEF10221042\0000;ABC000006.TIF;2

IM,ABC000007, ,0,@DEF1022104;DEF10221042\0000;ABC000007.TIF;2

IM,ABC000008, ,0,@DEF1022104;DEF10221042\0000;ABC000008.TIF;2

IM,ABC00009,D,0,@DEF1022104;DEF10221042\0000;ABC00009.TIF;2

IM,ABC00010, ,0,@DEF1022104;DEF10221042\0000;ABC00010.TIF;2

Reporting: Document Statistics

Just as in the previous example, we’ll need to open the file for reading with the following line:

datFile = open("..\\testData\\sample.lfp",'r')

Then, we’ll use a ‘for’ loop to iterate through each line of the file:

for line in datFile:

Finally, we’ll perform some string manipulation to transform each record into its individual fields.  The rstrip method removes the newline from the end of each line, and split breaks a string into substrings, based on the supplied delimiter (a comma in this case).  This is similar to Excel’s “Text to columns” function.

fields = line.rstrip("\r\n").split(",")

If line contains “IM,ABC00001,D,0,@DEF1022104;DEF10221042\0000;ABC00001.TIF;2” the operations will proceed in the following steps, from left to right:

  1. rstrip will remove the newline from the end of the line.
  2. split(",") will identify all commas in the string and build a list according to the delimiters.
  3. Finally, fields will be set equal to a list containing the following fields (notice that field numbers start at 0):

    fields[0]  fields[1]  fields[2]  fields[3]  fields[4]
    IM         ABC00001   D          0          @DEF1022104;DEF10221042\0000;ABC00001.TIF;2
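As a quick check of the steps above, here is a minimal sketch that parses a single record and prints each field next to its index.  The helper name parse_lfp_line is hypothetical and is not part of the original script.

```python
# Hypothetical helper (not from the original post): split one LFP record
# into its five comma-delimited fields.
def parse_lfp_line(line):
    return line.rstrip("\r\n").split(",")

# The backslash is doubled because this is a Python string literal.
sample = "IM,ABC00001,D,0,@DEF1022104;DEF10221042\\0000;ABC00001.TIF;2\n"
fields = parse_lfp_line(sample)

# Print each field with its zero-based index
for index, value in enumerate(fields):
    print(index, repr(value))
```

Printing with repr() makes stray spaces or leftover newline characters visible, which is handy when a load file does not behave as expected.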

 


With this basic construct, we can now begin to add code to discover and track features from the data.  Many features can be tracked simultaneously.  For instance, it’s common to want to know how many pages and documents are represented by a particular LFP file.  Page count for this data can be captured by initializing a counter variable outside of the loop and incrementing it with each line.  Similarly, document count can be obtained by incrementing a counter every time a non-empty value in the third field is encountered.

numDocs = 0
numPages = 0

for line in datFile:
  fields = line.rstrip("\r\n").split(",")
  numPages += 1
  if(fields[2] != " "):
    numDocs+=1

When the loop is finished iterating, numDocs and numPages should contain the appropriate values.
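The same counting logic can be tried without a file on disk.  The sketch below is only an illustration: it assumes a three-record in-memory sample (with shortened, made-up volume and path values) and wraps the loop in a hypothetical function, count_docs_and_pages.

```python
import io

# Assumption: a small in-memory sample standing in for the real LFP file.
sample = io.StringIO(
    "IM,ABC00001,D,0,@VOL;DIR;ABC00001.TIF;2\n"
    "IM,ABC00002, ,0,@VOL;DIR;ABC00002.TIF;2\n"
    "IM,ABC00003,D,0,@VOL;DIR;ABC00003.TIF;2\n"
)

def count_docs_and_pages(lfp_lines):
    """Return (document count, page count) for an iterable of LFP lines."""
    num_docs = 0
    num_pages = 0
    for line in lfp_lines:
        fields = line.rstrip("\r\n").split(",")
        # One line per page
        num_pages += 1
        # strip() treats both "" and " " as an empty document-break flag
        if fields[2].strip():
            num_docs += 1
    return num_docs, num_pages

num_docs, num_pages = count_docs_and_pages(sample)
print("Documents:", num_docs, "Pages:", num_pages)
```

Testing the flag with strip() rather than comparing against a single space is a design choice of this sketch; it tolerates load files that leave the field completely empty.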

QC: Finding Abnormalities

If you look at the data, you will notice that there is one document which is seemingly named with a different convention than the others.  Files starting with ABC000006 through ABC000008 are zero-padded to six places instead of five.  This can be easily detected and fixed with Python.

We’ll start out by assuming that all Bates numbers in this production should have uniform prefixes and padding length.  If that’s the case, then every Bates number in the file should be the same length, and adding code to detect otherwise is a simple matter, using Python’s built-in len() function.

if(len(fields[1]) > 8):

  nonConformNum = fields[1]

       

  print(str(numPages) + ": " + nonConformNum)

This code checks for any Bates numbers that are comprised of more than eight characters (three characters of prefix plus 5 of padding).  If any are encountered, the script will print the current value of numPages (which will be equivalent to the line number at any step in the loop) and the non-conforming Bates number.  This is helpful, because it alerts us to the presence of non-conforming values and provides line numbers or values to aid the search.  From this point, it’s only a little extra work to add code which fixes the problem.
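If the expected length is not known in advance, one possible variation is to infer it from the data.  The sketch below assumes that the most common Bates number length in the file is the correct one; that assumption, and the short list of sample values, are illustrative only.

```python
from collections import Counter

# Illustrative sample: three conforming numbers and one with extra padding
bates_numbers = ["ABC00001", "ABC00002", "ABC000006", "ABC00009"]

# Assumption: the most common length is the correct one for this production
expected_len = Counter(len(b) for b in bates_numbers).most_common(1)[0][0]

# Collect (line number, Bates number) pairs that deviate from that length
non_conforming = [
    (line_no, bates)
    for line_no, bates in enumerate(bates_numbers, start=1)
    if len(bates) != expected_len
]

print("Expected length:", expected_len)
print("Non-conforming:", non_conforming)
```

Unlike the hard-coded check for more than eight characters, this version also flags numbers that are too short.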

Writing Files: Outputting the Fix

We’ve already determined a way to find non-conforming lines and established that the errors are ‘cosmetic’ and can be safely fixed without any further investigation.  Since the piece of the string that we want to modify is in the middle, we can’t use simple functions such as the left and right truncation available in programs like Excel and Access.  We’ll need to take advantage of Python’s slicing notation, which provides a compact way of extracting a piece of a string.  We’ve seen in prior examples that one element in a Python list is accessed by placing a number within [] brackets.  Python also allows use of a range to return a sub-list or substring.  For example, we could isolate the prefix (the first three characters of a nine-character string) by specifying the range 0:3.

batesPrefix = nonConformNum[0:3]  #Will store ‘ABC’

We can use this same principle to capture the numerical portion of the Bates number and apply some extra commands to format it correctly at the same time.

batesNumber = nonConformNum[3:].lstrip("0").zfill(5)

As before, the compound statement to the right of the equal sign starts at the left and works to the right, one method at a time.  It does the following three things inline:

Command              Description                                                                                       Data
nonConformNum[3:]    Select all characters in the nonConformNum string, starting with the fourth character (index 3)   000006 (taken from ABC000006)
lstrip("0")          Strip all 0’s from the beginning of the resulting string                                          6
zfill(5)             Add the correct zero padding (five digits total) to the stripped-down version of the string       00006


When all three statements have completed, batesNumber now contains the correctly padded numerical portion of the Bates number.  These commands could be broken into multiple lines, but it is slightly more compact to represent them as a compound statement on one line, and we don’t need to save any of the intermediate results. 
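For readers who would like to see the intermediate results, here is the same transformation broken into separate steps.  This sketch slices from index 3, the first character after the three-letter prefix; the variable names are illustrative.

```python
non_conform_num = "ABC000006"

prefix = non_conform_num[0:3]      # first three characters: the prefix
digits = non_conform_num[3:]       # everything after the prefix
stripped = digits.lstrip("0")      # leading zeros removed
padded = stripped.zfill(5)         # zero-padded back out to five digits

fixed = prefix + padded
print(prefix, digits, stripped, padded, fixed)
```

One edge case worth knowing: a number made up entirely of zeros is stripped down to an empty string, and zfill(5) then pads it back to 00000.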

All that’s left is to add code to handle the output of our corrected data.  Assuming we’ll want to capture results in a new file, we’ll use a slight variation on the open statement which we’ve been using to open source data.  This will need to be specified before the loop.

correctedFile = open('..\\testData\\sample_corrected.lfp','w')

This is almost identical to previous uses of the open command, with the exception of the ‘w’ parameter that is passed to the function.  This tells Python that the file should be opened for writing.  If the file does not already exist at this location, Python will create it and open it as a blank file.  If it does exist, its contents will be deleted and it will be opened as a blank file.  (Note: be *very* careful when opening files for writing in Python, as any pre-existing data will be LOST).  correctedFile will now be available for writing within the loop.
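The truncating behavior of ‘w’ is easy to demonstrate safely.  The sketch below assumes a throwaway temporary file (not the sample_corrected.lfp file from this post) and also shows Python 3’s ‘x’ mode, which refuses to overwrite an existing file.

```python
import os
import tempfile

# Assumption: a throwaway temp file stands in for the real output file
path = os.path.join(tempfile.mkdtemp(), "demo.lfp")

with open(path, "w") as f:
    f.write("first version\n")

# Reopening with 'w' empties the file before anything new is written
with open(path, "w") as f:
    f.write("second version\n")

with open(path, "r") as f:
    contents = f.read()
print(repr(contents))

# 'x' mode raises FileExistsError instead of overwriting
try:
    with open(path, "x") as f:
        f.write("third version\n")
    overwrote = True
except FileExistsError:
    overwrote = False
print("Overwrote existing file:", overwrote)
```

If losing an earlier output file would be costly, opening with ‘x’ is one way to make the script stop rather than silently replace it.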

Before presenting the full code, we’ll present the join method, which saves a lot of typing if you’re outputting simple delimited records, like those found in an LFP.  The syntax might look a little strange if you’re new to object-oriented programming, but it’s intuitive as long as you remember what you’re trying to accomplish.

",".join(fields)

Join takes a list as its argument and flattens it by gluing the items together, using the string in double quotes as the separator between fields.  It takes data which once resided in compartmentalized and separate cells and flattens it into one string, with a marker to delineate the old boundaries.  This is not unlike saving an Excel file to a CSV.
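A short sketch shows that split and join are inverses of one another for a simple record like ours, and that join expects every item to be a string.  The list of numbers is an illustrative extra.

```python
# The backslash is doubled because this is a Python string literal
record = "IM,ABC00001,D,0,@DEF1022104;DEF10221042\\0000;ABC00001.TIF;2"

# Break the record apart, then glue it back together
fields = record.split(",")
rebuilt = ",".join(fields)
print(rebuilt == record)

# join only accepts strings, so other types must be converted first
numbers = [1, 2, 3]
joined_numbers = ",".join(str(n) for n in numbers)
print(joined_numbers)
```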

Putting it all together

Here’s the full working code:

if __name__ == "__main__":
    # Open the LFP file for reading
    lfpFile = open("..\\testData\\sample.lfp", 'r')

    # Initialize counters outside of the line-by-line iteration.
    # These variables keep track of LFP features as the program
    # steps through each line of the file
    numDocs = 0
    numPages = 0

    # Variable to track QC steps while stepping through the file
    nonConformingBates = 0

    # Open the output file for writing (existing contents are erased)
    correctedFile = open("..\\testData\\sample_corrected.lfp", 'w')

    print("Incorrect lines:")
    print("================")

    # Use a for loop to step through each line of the file
    for line in lfpFile:
        # This line applies two methods to the line variable in order
        # to normalize it for the remaining steps.  Method calls run
        # left to right, in order.
        #  1) rstrip -> removes the newline character from each line
        #  2) split  -> scans the string for the supplied delimiter and
        #               breaks it into substrings as it finds them
        fields = line.rstrip("\r\n").split(",")

        # Each line in this file corresponds to one page in the set
        numPages += 1

        # Non-empty field 2 means the start of a new document
        if fields[2] != " ":
            numDocs += 1

        # QC check to detect Bates numbers that are longer than 8
        # characters
        if len(fields[1]) > 8:
            nonConformingBates += 1

            # Assign the incorrect Bates number to a string
            nonConformNum = fields[1]

            # Print the line number and bad number for reporting
            print(str(numPages) + ": " + nonConformNum)

            # Isolate the Bates prefix by selecting the first three
            # characters of the sequence
            batesPrefix = nonConformNum[0:3]

            # Pull out the numerical portion of the beg Bates number
            # and format it with the correct number of zeros
            batesNumber = nonConformNum[3:].lstrip("0").zfill(5)

            # Overwrite the incorrectly padded number in the field list
            fields[1] = batesPrefix + batesNumber

            # Use the join method to merge all fields together with commas
            correctedFile.write(",".join(fields) + "\n")
        else:
            # This case is reached if the beg Bates has the correct
            # number of characters, so no processing is necessary and
            # the line can simply be copied over to the new file
            correctedFile.write(line)

    # Close both files now that the loop is finished
    lfpFile.close()
    correctedFile.close()

    # Display the final values of the variables
    print()
    print("Summary:")
    print("========")
    print("Number of Documents: " + str(numDocs) + ", Number of Pages: " + str(numPages))

Running this code with the sample input yields the following results:

Incorrect lines:

================

6: ABC000006

7: ABC000007

8: ABC000008

Summary:

========

Number of Documents: 3, Number of Pages: 10

 

Logik — How to check a file for duplicate lines - part 2

How to check a file for duplicate lines - part 2 Posted By Adam Reilly on February 15, 2010

This will just be a quick update to the last post.  In the previous version of the duplicate record detector the input file is specified statically (or “hard coded”) inside the script.  This means that the source code must be modified each time that users want to run analysis on a new load file.

Unlike compiled languages like C++ or Java, Python doesn’t have a lengthy build cycle associated with making changes, so editing the source isn’t too inconvenient.  Still, your users might not be comfortable directly modifying source code, and there’s always the potential to introduce bugs by changing the wrong line.  Fortunately, Python provides a way to pass data to a program via the command line.

The sys Module

Python has a built-in module called sys for interacting with the interpreter and its runtime environment.  sys is full of useful functions and variables, but we’ll just be using its argument-passing functionality: sys.argv is a list holding the script name followed by any arguments supplied on the command line.  Libraries are imported into Python scripts via the import statement.

Here’s the full working code:

import sys
import hashlib
import collections

# Defines a function that takes a string as its argument and returns the
#  hexadecimal representation of its MD5 checksum
#  In: A string
#  Out: A string of hex characters corresponding to the checksum
def calculate_md5(inStr):
    #create an instance of the md5 object from
    #python's hashlib
    md5Obj = hashlib.md5()

    #Convert the string to a series of raw bytes
    #assuming that it's UTF-8 encoded
    md5Obj.update(bytes(inStr, 'utf8'))

    #Render the object as a hex encoded md5 hash value
    return md5Obj.hexdigest()

if __name__ == "__main__":
    #Default factory method which creates an empty
    #dictionary of lists
    lineDict = collections.defaultdict(list)

    #keep a counter variable to track which line
    #of the file we're on
    i = 1

    #Create an iterable file object
    # use the sys library to pull arguments in from the command line
    datFile = open(sys.argv[1], 'r')

    #cycle through each line of the file
    for line in datFile:
        #Calculate the checksum of the record
        lineHash = calculate_md5(line)

        #Either create a new entry in the dictionary
        #or append to the list of lines with the same
        #check sum
        lineDict[lineHash].append(i)

        #Advance the counter to move to the next line
        i += 1

    #Close the file now that every line has been read
    datFile.close()

    # Finally, some code to print out the results

    #Print a title
    print("Duplicate Lines")

    #Cycle through each slot or 'key' in the dictionary
    for entry in lineDict:
        #If the length of the list is 2 or greater
        #print it out
        if len(lineDict[entry]) > 1:
            print(lineDict[entry])

A few notes on DOS

This code can be run from the DOS prompt with the following command:

C:\> python findDupLines.py "C:\Path to\theFile.txt"

There are a few important points to keep in mind when running Python scripts from the command prompt.  The most important is that Python must be able to find your script.  The easiest way to ensure this is to change directories to the location where your script resides.  In the example above, the findDupLines.py script would have to be located at the root of my C: drive.  Also notice the double quotes (") surrounding C:\Path to\theFile.txt.  These are necessary because of the space characters in the path.  If any folders in your path contain spaces, double quotes are mandatory.
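Because the script depends on an argument being present, it can help to check for it before opening anything.  The sketch below is one possible approach using the standard-library sys module; the helper name get_input_path and the wording of the usage message are assumptions, not part of the original script.

```python
import sys

# Hypothetical helper: return the path argument, or None if the user
# supplied the wrong number of arguments.
def get_input_path(argv):
    # argv[0] is the script name; argv[1] is the first user-supplied argument
    if len(argv) != 2:
        return None
    return argv[1]

if __name__ == "__main__":
    path = get_input_path(sys.argv)
    if path is None:
        print("Usage: python findDupLines.py <load file>")
    else:
        print("Would analyze: " + path)
```

A path that was quoted on the command line arrives as a single item in sys.argv with the quotes already removed, so no extra handling is needed for spaces inside the script.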

Logik — The Ten Step Rain-Dance

The Ten Step Rain-Dance Posted By Daniel Kaiser, Esq. on February 11, 2010

In the legal arena, regardless of how long you’ve been in the game, it always comes down to “making it rain.”

Myra L McKenzie, assistant general counsel in the Wal-Mart Stores, Inc. legal department, offers the following ten tips to rain-making in her article In Order to Make Rain, You Have to Know How to Gather the Clouds: Tips for Young Lawyers on Client Development, printed in the American Bar Association Young Lawyers Division 101 Practice Series.


  1. Do good work and always add value. [Produce a work product that is both timely and of high quality.]
  2. Find out if you have a client development budget and use it. [A clear record of how this budget is used may lead to a larger budget.]
  3. Be strategic. [Create and present a client development plan.]
  4. Perfect your professional presentation. [How does your web biography look?]
  5. Research your potential clients and their needs.
  6. Carry business cards and use them.
  7. Speak and speak often. [Build a reputation for competence in certain issues.]
  8. Get active in the bar. [This can increase your visibility.]
  9. Attend events frequented by in-house lawyers. [These are “face time” opportunities.]
  10. Learn to “pitch.” [Practice closing the deal!]

The full text of McKenzie’s article can be found reprinted in The ABA’s periodical the Young Lawyer, Vol.14, No.5, p.6 (Feb.-Mar. 2010).

Illustration from Holamun2

Logik — How to check a file for duplicate lines

How to check a file for duplicate lines Posted By Adam Reilly on February 10, 2010

In this edition of “eDiscovery-related Python Tricks,” we’ll cover some fundamental techniques and operations that you’ll likely find yourself using repeatedly.  Suppose you’ve been given the task of merging load files from several productions together. 

You’re fairly sure that merging several files together has left the load file with duplicative lines, but the file is large and this would be difficult to determine manually.  While this example may seem a little contrived, it will provide a simple setup for laying a foundation that will likely be re-used when we get to more interesting examples.

Opening files

The first thing we’ll need to do is tell Python where it can find the file it will be reading.  This is accomplished with the open function.  In Python, open accepts a path and a few arguments and returns a file object which the program can manipulate.  Don’t worry too much if this terminology doesn’t make sense; you’ll get a feel for it.  Assuming that the file resides at C:\loadFile\bigLoadFile.dat, we’d write the following code to open the file for reading:

datFile = open('C:\\loadFile\\bigLoadFile.dat', 'r')

This line searches the specified path for a file with the given name.  You’ll notice that the path slashes are doubled (i.e., C:\\ instead of C:\).  The “\” character is special in Python string literals: it is the escape character, used to write things like the newline “\n”.  A doubled backslash “\\” is how you write a single literal backslash, which is the folder separator in a Windows environment.  The second argument ‘r’ tells Python that this file will be opened *only* for reading.
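If the doubled backslashes are hard on the eyes, Python’s raw string prefix is an alternative way to write the same path.  This small sketch only compares the strings; it does not open the file.

```python
# Two ways to write the same Windows path in Python source code
doubled = "C:\\loadFile\\bigLoadFile.dat"
raw = r"C:\loadFile\bigLoadFile.dat"     # the r prefix turns off escapes

print(doubled == raw)
print(doubled)

# An escape sequence is one character; an escaped backslash plus 'n' is two
print(len("\n"), len("\\n"))
```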

Iterating line-by-line

Since we’ve determined that we’re interested in finding duplicative lines, we’ll need a way to access each line within the file.  Python provides a convenient method for doing this by making its file object “iterable”.  This is the computer science-y way of saying that a file object can hand back its lines one at a time (via Python’s built-in next() function) until the end of the file is reached.  This can be wrapped into a construct called a “for loop”, which executes a block of code once for each line.  Code to access a file line by line would look like this:

for line in datFile:

“for” is a special word within Python programs which marks the beginning of the loop.  “line” is an arbitrarily-chosen variable name which will hold the result of pulling successive lines out of datFile.
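To see the iteration at work without a load file on hand, the sketch below assumes an in-memory file created with io.StringIO.  It pulls one line by hand with next() and lets a for loop handle the rest, using enumerate to keep track of line numbers.

```python
import io

# Assumption: an in-memory file stands in for bigLoadFile.dat
dat_file = io.StringIO("fee\nfoo\nfoo\n")

# Pull the first line out manually
first = next(dat_file)

# The for loop picks up where next() left off; enumerate supplies the
# line number, starting at 2 because line 1 has already been read
rest = []
for line_no, line in enumerate(dat_file, start=2):
    rest.append((line_no, line.rstrip("\n")))

print(repr(first))
print(rest)
```

Using enumerate is an alternative to keeping a separate counter variable and incrementing it inside the loop.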

Dictionaries and MD5 Calculation

Now that we’ve set up the basic structure for looping through the entire file, it’s time to give some thought to the duplicate detection strategy.  Since there is no limit to the number of records a load file can contain, nor any size limit on rows, we’ll want to come up with an efficient way to capture and represent the data.  Additionally, we’ll use Python’s dictionary data structure to keep track of lines that we’ve seen so far.

lineDict = collections.defaultdict(list)

This line takes advantage of the Python collections library to create a dictionary which stores a list in each of its open slots.  We will calculate an MD5 value for each record in the DAT file.  Identical MD5 values mean, for all practical purposes, that the lines are exactly identical, so keeping track of line numbers and updating the dictionary accordingly should give a summary of identical lines.  The only missing piece is a way to calculate an MD5 hash value, given a string.  The following lines will be used to accomplish this:

def calculate_md5(inStr):
  md5Obj = hashlib.md5()
  md5Obj.update(bytes(inStr,'utf8'))
  return md5Obj.hexdigest()

Briefly, we’re taking advantage of another Python library to generate the hash values and returning them as a string.
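The dictionary and the hash function can be tried together on a handful of lines before pointing them at a real file.  In this sketch the list of lines is made-up sample data, and the hash helper is written in a slightly more compact form than the one above.

```python
import collections
import hashlib

def calculate_md5(in_str):
    # Encode the string as UTF-8 bytes and return the hex digest
    return hashlib.md5(in_str.encode("utf8")).hexdigest()

# Made-up sample data standing in for lines read from a load file
lines = ["fee\n", "foo\n", "foo\n", "fee\n", "pine\n"]

# Missing keys start out as empty lists, so append always works
line_dict = collections.defaultdict(list)
for line_no, line in enumerate(lines, start=1):
    line_dict[calculate_md5(line)].append(line_no)

# Keep only the hashes seen on more than one line
duplicates = sorted(nums for nums in line_dict.values() if len(nums) > 1)
print(duplicates)
```

With a plain dictionary, the first append for a new hash would raise a KeyError; defaultdict(list) removes the need to check whether the key exists yet.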

Putting it all together

Here’s the full working code:

import hashlib
import collections

# Defines a function that takes a string as its argument and returns the
#  hexadecimal representation of its MD5 checksum
#  In: A string
#  Out: A string of hex characters corresponding to the checksum
def calculate_md5(inStr):
    #create an instance of the md5 object from
    #python's hashlib
    md5Obj = hashlib.md5()

    #Convert the string to a series of raw bytes
    #assuming that it's UTF-8 encoded
    md5Obj.update(bytes(inStr, 'utf8'))

    #Render the object as a hex encoded md5 hash value
    return md5Obj.hexdigest()

# Main Program starts HERE!!!!!
if __name__ == "__main__":
    #Default factory method which creates an empty
    #dictionary of lists
    lineDict = collections.defaultdict(list)

    #keep a counter variable to track which line
    #of the file we're on
    i = 1

    #Create an iterable file object
    datFile = open('C:\\loadFile\\bigLoadFile.dat', 'r')

    #cycle through each line of the file
    for line in datFile:
        #Calculate the checksum of the record
        lineHash = calculate_md5(line)

        #Either create a new entry in the dictionary
        #or append to the list of lines with the same
        #check sum
        lineDict[lineHash].append(i)

        #Advance the counter to move to the next line
        i += 1

    #Close the file now that every line has been read
    datFile.close()

    # Finally, some code to print out the results

    #Print a title
    print("Duplicate Lines")

    #Cycle through each slot or 'key' in the dictionary
    for entry in lineDict:
        #If the length of the list is 2 or greater
        #print it out
        if len(lineDict[entry]) > 1:
            print(lineDict[entry])

Running this code with sample input:

  1. fee
  2. foo
  3. foo
  4. fee
  5. few
  6. few
  7. few
  8. few
  9. fine
  10. pine

Yields the following output:

Duplicate Lines

[5, 6, 7, 8]

[2, 3]

[1, 4]

As you can see, the script correctly captures and summarizes duplicate lines within the file. (The order in which the groups are printed can vary with your Python version.) Notice that lines nine and ten do not appear in the output, because they are unique. Although the sample input is small, the approach should scale up to very large files. Using the MD5 value instead of the actual string allows the algorithm to store a small, fixed amount of data per record regardless of its size, so it’s unlikely that you’ll hit memory limitations.
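For repeated use, the same logic can be wrapped in a function that accepts any iterable of lines. This is only a sketch under that assumption, not the script above verbatim; the name find_duplicate_lines and the in-memory sample list are illustrative.

```python
import collections
import hashlib


def find_duplicate_lines(lines):
    """Map each repeated line's MD5 digest to the 1-based line numbers where it occurs."""
    line_dict = collections.defaultdict(list)
    for number, line in enumerate(lines, start=1):
        digest = hashlib.md5(line.encode("utf-8")).hexdigest()
        line_dict[digest].append(number)
    # Keep only digests seen on two or more lines
    return {digest: numbers for digest, numbers in line_dict.items() if len(numbers) > 1}


# The sample input from above, held in memory instead of read from a file
sample = ["fee", "foo", "foo", "fee", "few", "few", "few", "few", "fine", "pine"]
duplicates = find_duplicate_lines(sample)

# Sort the groups so the report does not depend on dictionary ordering
for numbers in sorted(duplicates.values()):
    print(numbers)
```

A file object can be passed in place of the list, since iterating over a file yields one line at a time.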

Again, even if you don’t ever anticipate having to perform this specific task, it provides the shell for tasks that have to be performed repeatedly.  I hope that you’ve found this useful, and stay tuned for more eDiscovery-related Python Tricks.

Logik — Cloud Computing - the Winners and the Challenged

Cloud Computing - the Winners and the Challenged Posted By Daniel Kaiser, Esq. on February 2, 2010

Do you hear that noise? No? That noise you may or may not hear is the sound of a quiet revolution well underway. The cloud computing revolution is bringing along with it a gradual but dramatic wave of change to the world of network infrastructure, IT servicing, and business models. This transformation speaks to different parties in different ways with the promise of efficiencies and cost containment on the one hand weighing in against security and hidden cost worries on the other hand.

By way of a definition, Gartner Inc., the Connecticut-based IT research and advisory company, offers the following five attributes of cloud computing: 1) Service-based, 2) Scalable and Elastic, 3) Shared, 4) Metered by Use, and 5) Uses Internet Technologies.

How does cloud computing affect you? Now is the time to find out and to plan your next move. Some factors to consider:

Automatic Winners:

  • Data centers and cloud vendors. With the bulk of the migration to the cloud yet to come, it’s little wonder that Amazon, Microsoft, Apple, IBM and a host of others want to be in the game now. Houston, we are prepared for lift off.
  • Merger & acquisition attorneys. Sure, there are plenty of small-to-medium sized cloud vendors to choose from. Yet the expected industry trajectory says get ready for a whole lot of M&A consolidation leading to cloud Goliaths.
  • Early adopters of cloud technology. If you’re a start-up or a company anticipating significant IT infrastructure investments, an early migration to the cloud will allow you to bypass costly up-front hardware and IT fees.
  • Efficiency hawks. An IBM white paper reports an approximate 5% utilization rate of commodity servers on average. Yes, IBM is ramping up its efforts to capture the cloud computing market, but when you consider the money lost in operational costs, server maintenance and management, you have to admit that this makes a compelling argument.
  • Fans of Mobility. Do you like the idea of being able to access your software, applications and data remotely? No need to travel with your data or license multiple copies of software for home computing. Do you have a web browser? Just log in... from the moon!
  • Data security consultants. Just because your data is “out there” doesn’t mean you’re willing to share it with greedy, prying eyes. Are you confused by all the options? Data security consultants certainly hope so.
  • Encrypted networks are bound for growth.
  • Compliance officers and attorneys. We’re back to the issue of data security. Getting it right and making sure that it stays that way just got more complicated. This will require a professional’s attention. Due diligence should be performed down the chain of providers, and contractual arrangements touching on data handling need to be in place between multiple parties.

The Challenged:

  • Security (and Certainty). Where once you could sign a client’s contracts assuring a certain measure of data security and confidentiality with ease, now you are at the mercy of an outsourced support structure. Not only will you have to contend with your cloud provider’s handling of your client’s data, but you’ll also have to contend with your cloud provider’s outsourced support. Issues related to where data is located, how data is managed, and who could potentially access your data remain some of the greatest challenges to a wholesale migration to the cloud. How many borders is your data being transferred across? What are the legal implications?
  • Data centers and cloud vendors. In the face of all of the cloud’s shiny promises, the key players have their work cut out for them to assuage the public’s concerns regarding data security and platform integrity.
  • Infrastructure vendors will need to increasingly target the cloud vendors.
  • Software vendors will need to contend with, or become, SaaS companies.
  • Large corporations? An oft-cited McKinsey & Company report, “Clearing the Air on Cloud Computing,” claims that large corporations could actually lose by adopting cloud computing. McKinsey claims that the efficiencies so attractive to small and medium-sized clients will pale when applied to a large corporation. The report points out that a combination of in-house virtual computing together with tax write-offs involving equipment depreciation offers greater savings. Read it yourself, but bear in mind that service offerings continue to change dynamically. Costs and savings are not static.
  • In-house IT support staff. Exactly how many of these lost jobs will be replaced by new opportunities at remote datacenters? With efficiencies of scale, not enough. Especially when a significant number of those datacenters will be located off-shore.

Illustration by The Economist

Logik — We’re Hiring, Come Join Us!

We’re Hiring, Come Join Us! Posted By Andy Wilson on January 26, 2010

Logik is growing and we are looking for more talented, smart, and amazing people to come join our small company. Learn more about the company here.

Here are the open positions:

Logik — “Data! Data! Data!” — a Posse List interview with Andy Wilson of Logik

“Data! Data! Data!” — a Posse List interview with Andy Wilson of Logik Posted By Adam Reilly on January 25, 2010

This interview is part of the Posse List’s series “‘Data! Data! Data!’ — Cures for a General Counsel’s ESI Nightmares.” For an introduction to the series click here.

Logik was humbled to be the first company interviewed. Below is a copy of that interview:

Start Interview

Logik is one of the more extraordinary companies to come onto the e-discovery scene. A dynamic company, they derive their name from “the formal systematic study of the principles of valid inference and correct reasoning; and the interrelation or sequence of facts or events when seen as inevitable or predictable.” Or, in today’s parlance: “they’re a lean, mean e-discovery processing machine.”

Located in Washington, D.C., in beautiful loft space across from the National Portrait Gallery, the company counts AM Law 100 law firms and Fortune 500 corporations as its client base.

We had the opportunity to spend a few hours at Logik’s corporate headquarters talking with Andy Wilson.

TPL: Both you and your co-founder, Sheng Yang, were in school together at Virginia Tech, but you didn’t really know each other until after school, correct?

AW: Correct. I actually started college as an English major but moved to Computer Science, graduating with a degree in Business Information Technology in 2001. Sheng and I did meet briefly at Virginia Tech.  We both worked at a web design company for a few months, but we didn’t really know each other.

TPL: But straight after school you went home to Kentucky and tried “the entrepreneurial thing” and did independent web design.

AW: Yep. After a year in Kentucky, I realized I wasn’t really in the right place to launch my web career, so I headed to D.C. (proceeds from my tax refund check happily in hand) and began to interview with a multitude of government agencies. The government was in a hiring frenzy with respect to tech/web people. Thing was, as I cruised the various floors of endless cubicles (think “Dilbert” here), I said to myself, “Andy, do you really want to do this?” and the answer was a clear “No.”

TPL: And an opportunity popped up at Driven?

AW: That’s right. I was hired at Driven as a “techie,” which I did for several months before moving into sales. At the time, Driven specialized in various aspects of litigation support including digital reproduction, paper discovery, scanning, copying, printing and graphic design. It was not “e-discovery” per se, but it was fascinating because we sold the accounts and the projects, and we actually did the scanning, printing, blowbacks, etc.

It was incredibly long hours, but I was in my early twenties and didn’t mind. I was able to bring in a few top AM Law 100 law firms in my first year, generating about $1 million in sales over my first 12 months. And it was at Driven that I reconnected with Sheng.

TPL: And an idea was formed ….

AW: Yep! I wanted to go into application development, to create an easy and scalable software platform to do eDiscovery processing. And I found in Sheng a kindred spirit. But, at the time, Driven did not want to become a software company, so we parted ways and Sheng and I went off to start an eDiscovery processing company.

TPL: With an idea … but no clients.

AW: Yep, no clients – we had nothing to sell yet! (laughs) Sheng and I spent about a year writing code and our business plan, living off our credit cards, my wife’s salary… and working from my dining room. In the spirit of Steve Jobs, Apple and his garage!!

TPL: And then enter … Superior Glacier, your first client.

AW: Superior Glacier is an end-to-end litigation support provider focused on marketing in New York, Chicago and Washington, D.C. They first came to us through a friend of a friend. We were expecting the usual “tell us about Logik” – you know, a simple introductory meeting. Which we did on my couch. But then they whip out this DVD and say “Well, we are having issues with this data, what can you do with it?” So, we loaded it up onto our server (and quite frankly we were a little apprehensive thinking “Man, I hope this works!”) and we ran it through the software we developed: Gridlogik™.

TPL: And, of course, Superior Glacier was more or less expecting your response to be “We’ll get back to you in a few days…”

AW: Exactly. Except our software went through its paces and produced the results they were looking for in about 30 minutes. All the files were accurately processed, converted, numbered and exported with ready-to-import load files.

TPL: And you blew them away.

AW: Pretty much, yes. I don’t think they were expecting two guys in a dining room to solve their eDiscovery problem, but we did.

They had been working on this data set for almost two weeks with no results. We pointed out the problems they were having and how our software identified and fixed them. After they thoroughly rummaged through the output to confirm the results, we got to talking about pricing. They were used to the industry standard, which at the time was to charge based on the number of gigabytes the data extracted to, post processing. (Note to readers: Data extraction is the process of breaking down structured and unstructured data into individual records or documents. For instance, saving attachments from emails as their own documents or extracting files from .zip files is considered data extraction. This process is time-consuming and can result in the original data set exploding in size, often doubling and sometimes tripling from the original size.)

We had then, and still do, a very simple pricing model. We built our technology in a way that allows us to price eDiscovery on the original, non-extracted data size. We engineered a data-mapping algorithm that quickly identifies all documents in a data set without actually extracting it. Basically, it’s our secret sauce. So, this attractive pricing model coupled with a new technology that was very capable got Superior Glacier thinking. They sent us our first “real” project the following week.
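To make the note on data extraction concrete, here is a minimal Python sketch of how a container can expand once its contents are extracted. It builds a small zip archive in memory and compares the archive size with the total size of the files inside. The file names and contents are invented for illustration, and this is not a description of how Gridlogik works.

```python
import io
import zipfile

# Build a small archive in memory; names and contents are hypothetical
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("memo.txt", "quarterly results " * 500)
    archive.writestr("notes.txt", "meeting notes " * 500)

archive_size = len(buffer.getvalue())

# Sum the uncompressed size of every member without writing anything to disk
with zipfile.ZipFile(buffer) as archive:
    extracted_size = sum(info.file_size for info in archive.infolist())

print(archive_size, extracted_size)
```

Repetitive text like this compresses unusually well, so the gap here is larger than a typical data set would show; the point is only that pricing on extracted size and pricing on original size can give very different numbers.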

TPL: And they paid you $40,000?

AW:  Just about, yes. Sheng and I were singing and dancing. We thought “wow, we really are onto something here!” All of our start-up costs were covered and we had enough money left over to buy some more servers.

TPL: Ah, yes, the giddy feeling from the first paying client.

AW: In fact, they flew us up to New York to discuss a potential acquisition. Then we KNEW we had something! However, we took the road less travelled, so to speak, and decided to keep Logik between Sheng and me. Superior Glacier is still a client today and we help them on a few legacy cases.

TPL: I imagine you have had a number of companies who want to license your technology (which we think can usually end badly, resulting in “brand smashing,” among other issues).

AW: There have been a few companies interested in licensing Gridlogik, but we have always turned them down. Right now we are a business-to-business services company. If we licensed our software, we would lose control over the technology and become only a software company. If someone else used Gridlogik and made mistakes, that would negatively affect our reputation and our brand. We always want to do it right.

TPL: So, we have a very happy Superior Glacier. And then Fried Frank comes on board.

AW: The DC litigation manager at Fried Frank had some complex processing problems that involved unified communications, specifically Bloomberg data. (Note to readers:  for some background information on unified communications click here).

While analog is somewhat easy to analyze and parse, unified communication offers one enormous text file. That means you need to know how the software created the file, break out the metadata, and so on. It’s much more complex. In the case of Fried Frank’s client, they had about 20 gigabytes of this stuff that needed to be reviewed. And of course, we were eager to do whatever we could to help our new client. So, we modified Gridlogik to quickly parse and piece all the data together into a reviewable format similar to what they were getting with Outlook email reviews. We finished the project in less than two days and the client was very, very happy.

TPL: And this led to Fried Frank referring Williams & Connolly, who referred you to Finnegan and Henderson, etc., etc.

AW: Exactly. We have done very little direct marketing. Almost all of our business, both for law firms and direct corporate clients, has come from referrals. Granted, it’s a somewhat slower customer acquisition process, but we find it beats cold-calling any day, and we’re fortunate to have very loyal, happy clients.

TPL: Tell us a bit about the first work for Finnegan — without mentioning the actual corporate client. You know, confidentiality and all that.

AW: Well, Finnegan had been working with a vendor (client’s choice) who had totally screwed up the data processing on a high-profile matter. The vendor had worked for months on it. There were missed deadlines, incorrect deliverables and poor communication throughout. Obviously, this would frustrate anyone, so Finnegan decided to look elsewhere for help. Logik was recommended to Finnegan by one of our clients and we ended up winning their confidence after processing some sample data. In under two months we re-processed all the data, matched up the already coded documents, and re-produced the data in a much cleaner and more consistent manner. And with that, Finnegan became a happy client of ours, too.

TPL: You do a pretty large amount of work for a major top 3 accounting firm, yes?

AW: We do, yes. That work is mainly for rapid “banking-related” document productions to the government. They also work with us on more complicated Lotus Notes projects that they would rather outsource to us. It’s a great working relationship that we value highly.

TPL: Tell us a little bit about your work with predictive coding, that is, the capability to use a small set of (partially) coded documents to predict document coding over the complete corpus. I believe Recommind has done a lot of work in this area.

AW: Sure. Predictive coding is going to be big in the next few years. It makes sense, considering the volume of data lawyers have to review in a finite time frame. To get our feet wet in this space, we participated in the 2009 TREC legal study. It was fascinating and quite challenging, but we learned many useful methodologies to help our clients use advanced machine learning techniques to apply predictive coding to their documents. Like everyone, we are new to this area, but we are putting more resources into it in the coming year. We’re pretty excited about it.

TPL: As we discussed, the big “new new” thing all of last year — at every event we covered — was early case assessment and winnowing relevant data down to reduce the number of documents to review. As the stats bear out, it is the most expensive part of the process. But now we have predictive coding, plus the work being done in computer assisted review as evidenced by Patrick Oot and Anne Kershaw’s study “Document Categorization in Legal Electronic Discovery: Computer Classification vs. Manual Review,” plus the work being done by Google and Microsoft on auto-categorization or auto-coding. Are we headed down the path to where machines can be statistically proven to be as accurate as human review? Is the technology getting to the point where we can also winnow out the eyeballs — contract attorney reviewers?

AW: There’s too much value in humans to take us out of the equation. Technology is just a means to an end, and I don’t think we will see document review sans humans in our lifetime. I do see document review teams getting smaller, focused and more tech-savvy, like your “special ops” reference. With the right set of tools, a small number of tech-savvy attorneys can rip through massive amounts of data in a very short amount of time. The days of expensive linear review are numbered.

TPL: But the legal industry is… to put it mildly … risk averse. Despite the lamentations of Richard Susskind and Jordan Furlong that the law profession needs to understand the tectonic changes that are occurring, change will only come slowly.

AW: “Risk averse” is definitely putting it mildly, but smaller, younger and more agile firms are starting to sprout up, willing to take on traditional practices with new billing structures and a willingness to use technology in the best interest of the one paying the bills: the client. This follows a similar path as Logik – a small, fast-paced and lean company that can deliver results in a new way using new tools and methods. Things change, and to stay prime you have to stay on the wave of that change.

TPL: In Malcolm Gladwell’s book Outliers, he mentions Skadden, Arps as an example of a firm that has taken an opportune period of time, and some cultural advantages, to give it an edge. He also talks about the “10,000 Hour Rule.” Sort of reminds us of Logik, yes?

AW: I think so. The 10,000 hour rule is an interesting concept. Basically, it comes down to timing, perseverance and practice, practice, practice. We were fortunate to start our company at the right time, just when eDiscovery was starting to get hot in 2004. We persevered through very tough times trying to validate our market and our existence as a niche processing player. And we got a lot of practice. In the first few years, we focused just on eDiscovery processing, exposing us to many unique situations that our competitors haven’t even come across yet. Unfortunately, it meant I saw my wife for maybe only two hours a day for a year. But you do what it takes to succeed. It was a lot of hard work, but also very fun.

TPL: You now have a multimillion dollar business, all done with 14 employees. But you are expanding. Given the nature of your operation, I imagine you need to consider “culture matching” to a great degree.

AW: We do. We have been looking to add an additional eDiscovery project manager … and, in fact, we advertised via The Posse List and got a great response, thank you! Fitting into the culture is super important to us. Logik isn’t for everyone, so it’s tough to find the right person. And we don’t compromise based on someone’s resume. We encourage people to check the website at logik.com for openings.

TPL: Ok, now, a key question: your Logikbot mascot, symbol [shown above] … just what is that thing?

AW: Ha! Logikbot. Well, he’s both, really. Logikbot is a metaphor for who we are: a small team of smart and motivated people offering great technology and service, taking on the established big vendors (we call them BV2000s). Unlike many companies in this space, we embrace the fact that we are small and nimble. It’s a big advantage for us, so there is no reason to act all big and mighty. Our clients and our work speak for themselves. Logikbot is akin to the main character in Rudy… who doesn’t want to root for the little guy?

TPL: And this new site you have created called eDDstuff.com. That’s charity-driven, yes?

AW: Absolutely. We’re really happy that eDDstuff.com is a fun, charity-driven destination. We figured the eDiscovery industry needed funny and witty t-shirts, so we created about nine different designs from “eDiscovery ninja” to “eDiscovery nerd.” Ten percent of every purchase goes to a local DC charity, with the rest going to the vendor who makes the actual products. It’s been a huge success since we launched it and the orders have already started coming in.

TPL: Andy, it was a pleasure chatting with you. We appreciate the time you’ve taken.

AW: This has been great, let’s do it again soon. We have some very interesting things coming out in 2010 that we think your “special ops” team will really like.

End Interview

Logik — Leaders Portfolio Chats With Logik’s CEO About eDiscovery and Noodles

Leaders Portfolio Chats With Logik’s CEO About eDiscovery and Noodles Posted By Adam Reilly on January 5, 2010

Leaders Portfolio, an online and radio-distributed interview show (leadersportfolio.com), invited Logik’s CEO, Andy Wilson, to chat about how Logik started, what eDiscovery is, and various entrepreneurial experiences. The interview is about fifteen minutes long.

Check it out

Logik — Preliminary Look at a Preliminary Report on Civil Rules

Preliminary Look at a Preliminary Report on Civil Rules Posted By Daniel Kaiser, Esq. on November 23, 2009

Emery G. Lee III and Thomas E. Willging of the Federal Judicial Center recently released their Preliminary Report to the Judicial Conference Advisory Committee on Civil Rules. You might be thinking to yourself, “Wow, a 191-page preliminary report... on Civil Rules... what’s in it for me?” A fair question, but actually this thing is pretty interesting.

To begin at the end, I was intrigued to find 77 pages of feedback from survey respondents classified according to the clients they represent: plaintiff’s attorneys, defendant attorneys and attorneys representing plaintiffs and defendants about equally. Predictably, voices from the plaintiff’s attorney sector are pointing out abuses of discovery perpetrated by defendant attorneys adversely affecting both the duration and cost of the process… while similar voices from across the aisle are complaining of discovery abuse on the part of plaintiff attorneys. (Maybe these guys could get together and talk?)

Moving on to the survey’s actual findings, here are a few intriguing statistics all taken from attorney feedback following closed cases:

  • Figure 11 – By far, most attorneys (72.4% of plaintiff attorneys and 78.3% of defendant attorneys) reported no ESI disputes in their closed case.
  • Figure 13 – Both plaintiff and defendant attorneys reported, for the most part, that the information produced by party-generated discovery was “just the right amount.”
  • Figure 14 – A similar majority of both plaintiff and defendant attorneys found that their discovery costs were “just the right amount” when compared to their client’s stakes.
  • Figure 17 – Most attorneys, regardless of which side of the “v.” they hail from, agree that “the parties in the named case were able to reduce the cost and burden of the named case by cooperating in discovery.”
  • Figure 19 – When asked how discovery costs affected settlement, the most frequent response by far was “no effect.”
  • Table 10 – In cases involving discovery costs, plaintiff attorneys estimate those costs to be a median of 1.6% of their client’s estimated stakes. Defendant attorneys estimate a median of 3.3%. (Perhaps this gives credence to the defendant attorneys’ perspective, as mentioned above in survey respondent feedback.)
  • Figure 37 – Nearly all respondent attorneys agree that “attorneys can cooperate in discovery while still being zealous advocates for their clients.”

This is just a small sample of what the report has to offer. It’s worth taking a look.

Illustration by Carlos Castellanos © 2008

Logik — Logik Launches eDiscovery Apparel Website eDDstuff.com

Logik Launches eDiscovery Apparel Website eDDstuff.com Posted By Andy Wilson on November 11, 2009

Fun apparel & merchandise with a serious charitable attitude; perfect for everyone

For Immediate Release
WASHINGTON DC – November 11, 2009 – Just in time for the upcoming holidays, Logik, a Washington, DC-based eDiscovery company, is proud to announce the launch of the world’s first and only eDiscovery apparel and merchandise website at www.eDDstuff.com. Visitors of all ages will discover fun, hip designs professionally printed on a variety of apparel and merchandise, including comfy t-shirts, warm hoodies, large coffee mugs and even downloadable versions of the fun eDiscovery images for both computer and iPhone wallpapers.

“When we told our friends, family and clients that we were launching an eDiscovery clothing and merchandise website,” laughs Andy Wilson, Co-Founder and CEO of Logik, “the one question we heard over and over was… ‘Why? Who does that?’ When we first came up with the idea, it just made sense to us. Logik is energetic and passionate about what we do. We wanted a different way of showing that.”

So, why would an eDiscovery company launch a website featuring whimsical designs on clothing? There’s a simple answer to that: it’s just one more example of how Logik thinks differently than other companies.

“Like us, people who work on eDiscovery projects are passionate about their industry, whether they work for the government, a law firm or a corporation,” says Sheng Yang, Logik Co-Founder and CTO. “The eDDstuff website allows us to enjoy our shared passion. Maybe it’s more like we get to show we’re all part of the same club.”

In reality, eDDstuff.com is about more than eDiscovery stuff – it’s also about giving back. Ten percent of every purchase made on eDDstuff.com goes directly to Martha’s Table, a non-profit organization located in Washington, DC (the remainder of the purchase price goes to Zazzle.com, which actually makes the products). Martha’s Table’s mission is to help at-risk children, youth, families and individuals in the Washington, DC community improve their lives by providing educational programs, food, clothing and enrichment opportunities.

“You don’t have to be an ‘eDiscovery nerd,’ as one of the designs happily proclaims, to enjoy all the great things on eDDstuff. Buy a shirt for your mom, a hoodie for your honey and download the ‘eDiscovery Ninja’ for your iPhone. Wear your eDiscovery smarts with pride,” says Andy.

Get a Shirt. Give a Bite. Visit eDDstuff.com and pick up a mug or a hoodie to keep warm this winter, and give a little back.

About Logik:
Logik is an eDiscovery processing company located in Washington, DC. Number 181 on the 2009 Inc. 500, Logik helps corporations, law firms, government agencies and service providers simplify electronic data sought in discovery requests. Logik’s innovative and highly distributed processing platform, Gridlogik, was developed to process all kinds of unstructured and structured data sets such as email databases, spreadsheets, images and MS Office documents. Combined with their transparent pricing model, Logik offers customers the smart way to discover accurate results and make sense of processing costs. Find out more at logik.com. Media interested in setting up an interview with a representative from Logik should call 800-951-5507.
###

Logik — Making a Federal Case of the Duty to Produce

Making a Federal Case of the Duty to Produce Posted By Daniel Kaiser, Esq. on November 2, 2009

In our last post we had a look at the Duty to Preserve. Leaving that pickle behind, today we’re moving on to the Duty to Produce. Or, as the Federal Rules of Civil Procedure would term it, the Duty to Disclose.

In the federal context, the duty to disclose has been bundled up neatly in Fed. R. Civ. P. 26. Rule 26 should be examined and addressed early when facing a potential lawsuit because, absent an exemption, some of the required disclosures must be made from the very outset – “without awaiting a discovery request” – including contact details for those who are likely to have discoverable information.

More interesting than Fed. R. Civ. P. 26(a)’s coverage of disclosure’s “whos, whats, whens and hows” are the following subsections and their coverage of Discovery’s Scope and Limits. Although discovery’s potential scope is broad[1], the limitations are numerous, including:

  • [Upon proper showing,] A party need not provide discovery of [ESI] from sources that the party identifies as not reasonably accessible because of undue burden or cost . . . [that is, unless] the requesting party shows good cause . . . .[2]
  • The court must limit discovery’s frequency or extent if it finds requests to be unreasonably cumulative or duplicative, if another source would be less burdensome, if the requesting party had ample prior opportunity to obtain the information sought, or if burden or expense outweighs the requested information’s likely benefit.[3]
  • Both information that has been withheld from disclosure and information that has been produced may be subject to a claim of privilege or of protection as trial-preparation material.[4] (Refer to Fed. R. Evid. 502 for specific provisions related to privilege and work product.)

It’s important to note that these duties apply even to those who don’t have a pan on the fire. Fed. R. Civ. P. 34(c), citing Rule 45, points out that even “a nonparty may be compelled to produce documents . . . or to permit an inspection.”

Production format issues seem to have been hammered into the rule book and can be found repeated through Rules 26, 34 and 45. The Advisory Committee’s comment on this issue points out that:

  • The rule does not require a party to produce [ESI] in the form in which it is ordinarily maintained, as long as it is produced in a reasonably usable form. But the option to produce in a reasonably usable form does not mean that a responding party is free to convert [ESI] from the form in which it is ordinarily maintained to a different form that makes it more difficult or burdensome for the requesting party to use the information efficiently in the litigation. If the responding party ordinarily maintains the information it is producing in a way that makes it searchable by electronic means, the information should not be produced in a form that removes or significantly degrades this feature.[5]


[1] See Fed. R. Civ. P. 26(b)(1).
[2] Fed. R. Civ. P. 26(b)(2)(B).
[3] Fed. R. Civ. P. 26(b)(2)(C).
[4] Fed. R. Civ. P. 26(b)(5).
[5] Fed. R. Civ. P. 34, Advisory Committee’s Note to the 2006 Amendment.

Logik — Happy Halloween!!

Happy Halloween!! Posted By Andy Wilson on October 30, 2009

Rawr!!!!

Logik — Logik + Equinix = Speed n Security

Logik + Equinix = Speed n Security Posted By Andy Wilson on October 29, 2009

We are very excited to announce some big news at Logik.  Our processing power (all of our servers and your data) is now housed in our new data center at Equinix.  If you drove through Chinatown on your way to work over the past few months, you may have noticed some street construction between 7th and 9th Streets. Sorry, that was us.  We were installing a secure high-speed fiber line into our Equinix data center.

Who’s Equinix, you ask?  The world’s leading global data center and interconnection provider (Nasdaq: EQIX).  Netflix, DoubleClick, Amazon, Google, and Adobe use Equinix to house their servers, and now Logik does too. www.equinix.com/company/customers

So, what does this mean for you?

Speed

The secure fiber line we had installed gives us Gigabit Ethernet connectivity to our servers, even though they are now in Ashburn, Virginia.  We also increased our internet bandwidth by 10x and can now provide lightning-fast FTP uploads/downloads at speeds up to 100 Mbps.  Downloading 10+ gigabytes in just a few hours is now a reality (assuming you aren’t using dial-up).
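
As a rough sanity check on those numbers, here is a minimal sketch of the transfer-time arithmetic. It assumes decimal units (1 GB = 1000 MB), a perfectly sustained line rate, and no protocol overhead, none of which holds exactly in practice. The function name is ours and purely illustrative.

```python
def transfer_seconds(size_gigabytes: float, rate_mbps: float) -> float:
    """Idealized transfer time for a file of the given size at the given line rate."""
    size_megabits = size_gigabytes * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    return size_megabits / rate_mbps


best_case = transfer_seconds(10, 100)   # the full "up to 100 Mbps"
tenth_speed = transfer_seconds(10, 10)  # a connection achieving a tenth of that

print(f"{best_case / 60:.1f} minutes at 100 Mbps")
print(f"{tenth_speed / 3600:.1f} hours at 10 Mbps")
```

At the full line rate, 10 gigabytes would take minutes. The "few hours" figure is closer to what you would see when the slower end of the connection sustains only a fraction of that speed.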

Security

When 80% of the East Coast’s internet traffic flows through your data center, it’s safe to assume security at Equinix is top-notch.  That being said, here are some of the security highlights: N+1 power redundancy, precision HVAC temperature controls, smoke detection units, fire suppression systems, environmental control, biometric hand-geometry scanners, monitoring via CCTV and tightly controlled access.

Watch video

Logik — Spoliation and the Duties you Do - Preservation vs. Production

Spoliation and the Duties you Do - Preservation vs. Production Posted By Daniel Kaiser, Esq. on October 16, 2009

If you don’t want to spoliate all over yourself, it’s best to know how to do your duties.

Judge Grimm’s comments on the not-quite-twin duties of Preservation and Production in Goodman v. Praxair Services, Inc. come in the form of an easily overlooked footnote[1], but this is a sidebar worth looking into. Judge Grimm points out that there is “an important difference between the duty to preserve and the duty to produce . . . .”[2] This post, the first of a two-part series, will take a closer look at the duty to preserve.

Preserve.

Preservation is the duty with which spoliation comes into play. In Zubulake v. UBS Warburg LLC (Zubulake V),[3] Judge Scheindlin’s ruling stated that “[s]poliation is the destruction or significant alteration of evidence, or the failure to preserve property for another’s use as evidence in pending or reasonably foreseeable litigation.”[4]

The Sedona Conference Working Group on Electronic Document Retention & Production illustrates litigation holds by stating that “whenever litigation [or a regulatory investigation or proceeding] is reasonably anticipated, threatened or pending against an organization [or natural person], that organization has a duty to preserve relevant information. This duty arises at the point in time when litigation is reasonably anticipated whether the organization is the initiator or the target of litigation.”[5]

Judge Grimm made further reference to Zubulake IV in spelling out this duty: “Once a party reasonably anticipates litigation, it is obligated to suspend its routine document retention/destruction policy and implement a ‘litigation hold’ to ensure the preservation of relevant documents.”[6]

Potentially the best definition of “relevant documents” can also be found in Zubulake IV, including:

[A]ny documents or tangible things (as defined by [Fed.R.Civ.P. 34(a)]) made by individuals “likely to have discoverable information that the disclosing party may use to support its claims or defenses.” . . . [A]lso . . . documents prepared for those individuals, to the extent those documents can readily be identified (e.g., from the “to” field in e-mails) . . . [A]lso . . . information that is relevant to the claims or defenses of any party, or which is “relevant to the subject matter involved in the action.”[7]

You may well have a comfortable grasp of the litigation hold-based duties to preserve, but preservation duties extend beyond the first, basic step of issuing a litigation hold. In July we saw this point reiterated in Pinstripe, Inc. v. Manpower, Inc.,[8] a hearing on the motion for sanctions against Pinstripe for failure to preserve documents relevant to a court proceeding. Again referencing Zubulake, the U.S. Magistrate Judge held that

. . . a party’s issuance of a litigation hold does not end its responsibilities in discovery. The party must see that the litigation hold is complied with, “monitoring the party’s efforts to retain and produce the relevant documents.” . . . This necessarily involves communication with all of the “key players” in the litigation.[9]

Finally, the Federal Rules of Civil Procedure create a back-door safe harbor for electronic information system maintenance, entitled “Failure to Provide Electronically Stored Information”: “Absent exceptional circumstances, a court may not impose sanctions under these rules on a party for failing to provide electronically stored information lost as a result of the routine, good-faith operation of an electronic information system.”[10]

We’ll move on to the duty to produce in my next post.

[1] __ F.Supp.2d __, 2009 WL 1955805 at *17 n.13 (D.Md. July 7, 2009).
[2] Id.
[3] 2004 U.S. Dist. LEXIS 13574; 85 Empl. Prac. Dec. (CCH) P41,728.
[4] Id. at HN1.
[5] http://www.thesedonaconference.org/content/miscFiles/Legal_holds.pdf at 1.
[6] Goodman, FN 1, at *14 (quoting Thompson, 219 F.R.D. at 100, quoting Zubulake IV, 220 F.R.D. at 218).
[7] Zubulake IV, 220 F.R.D. at 217-18 (footnotes omitted).
[8] Pinstripe, Inc. v. Manpower, Inc., 2009 WL 2252131 (N.D.Okla.).
[9] Id. citing Zubulake v. UBS Warburg LLC, 229 F.R.D. 422, 432 (S.D.N.Y. 2004).
[10] Fed. R. Civ. P. 37(e).

Logik — Pics from the Logik Open House and Inc 500 Party

Pics from the Logik Open House and Inc 500 Party Posted By Andy Wilson on October 2, 2009

We threw a party in our new office to celebrate our new home and our Inc. 500 award (#181).  In case you missed it, check out the pictures we posted.  We had a good turnout: about 100 people showed up and enjoyed catered food from Occasions, pool, fresh drinks, and of course some Mario Kart racing on the Wii.  Everyone had a great time.  Check out the pics!

Logik — Logik Offers NO Sales Tax to Clients

Logik Offers NO Sales Tax to Clients Posted By Andy Wilson on October 1, 2009

We are excited to announce that, as of today, Logik has qualified to be a High Technology Company in the District of Columbia. Ok, so what does that mean?

It means Logik no longer applies SALES TAX (now at 6% by the way) to any of our invoices. Yes, this is NOT an April Fools’ joke and is totally legal and legit. It’s kind of like a big 6% discount across the board for all of our valued clients (and future clients-wink). This results in HUGE savings for many of our clients, as sales tax can really add up, especially for larger projects. Last year alone we tacked on $200,000 in sales tax.

We love discounts and we hope you do too. Please give us a call or write to us with any questions you may have about this change.

Logik — A Grimm View on Spoliation

A Grimm View on Spoliation Posted By Daniel Kaiser, Esq. on September 11, 2009

Goodman v. Praxair

Would you be surprised to hear that Judge Paul Grimm, Chief United States Magistrate Judge for the U.S. District Court for the District of Maryland, holds the Parachutist Badge, the Meritorious Service Medal, the Army Commendation Medal and the Army Achievement Medal? I was. These just aren’t the usual images springing to mind when one thinks of the small handful of federal judges in the eDiscovery world who have been instrumental in getting eDiscovery’s rules of the game out there with clarity. Lawyers beware; you don’t want to be on his bad side.

So bombs away: True to form, Judge Grimm’s decision in Goodman v. Praxair Services, Inc.[1] packed a punch with its wealth of analysis and rules.

This was a case in which Goodman, a pro se litigant, filed suit for breach of contract based upon non-payment of a success fee. Marc Goodman was hired by the Tracer Research Corporation (“Tracer”) to help secure an Environmental Protection Agency exemption for Tracer’s products. Although Tracer succeeded in winning its desired exemptions, the company refused to pay Goodman the stated fee, claiming that other third-party consultants were solely responsible for obtaining the exemptions.

The opinion revolved around Goodman’s Motion for Spoliation Sanctions, which was based on Tracer’s failure to institute a timely litigation hold and on Tracer’s destruction of computers (and files) after the duty to preserve had been triggered.

Here is a quick look at a few eDiscovery take-aways from Goodman.

The Two Main Federal Law Sources of a Court’s Authority to Levy Sanctions on a Spoliator. Judge Grimm points to the following:

(1) First, there is the “court’s inherent power to control the judicial process and litigation, a power that is necessary to redress conduct ‘which abuses the judicial process.’”[2]

(2) Second, if the spoliation violates a specific court order or disrupts the court’s discovery plan, sanctions also may be imposed under Fed.R.Civ.P. 37.[3]


The Required Elements for Spoliation Sanctions. The proponent must prove:

(1) [T]he party having control over the evidence had an obligation to preserve it when it was destroyed or altered;

(2) the destruction or loss was accompanied by a “culpable state of mind;” and

(3) the evidence that was destroyed or altered was “relevant” to the claims or defenses of the party that sought the discovery of the spoliated evidence, to the extent that a reasonable factfinder could conclude that the lost evidence would have supported the claims or defenses of the party that sought it.[4]


The Culpability Requirement. The mens rea element included in number two above may be satisfied by one of the following three states of mind:

(1) Bad faith / knowing destruction. “Bad faith” as used here means “destruction for the purpose of depriving the adversary of the evidence.”[5] “Knowing destruction” has been related to “willful,” and “[d]estruction is willful when it is deliberate or intentional . . .”[6]

(2) [G]ross negligence, and

(3) [O]rdinary negligence.[7]


Sanctions Involving Money. When it comes to motions for a reimbursement of discovery costs and attorneys’ fees, four situations give rise to such awards:

(1) First, courts will award legal fees in favor of the moving party as an alternative to dismissal or an adverse jury instruction.

(2) Second, courts will grant discovery costs to the moving party if additional discovery must be performed after a finding that evidence was spoliated.

(3) Third, in addition to a spoliation sanction, a court will award a prevailing litigant the litigant’s reasonable expenses incurred in making the motion, including attorney’s fees.

(4) Fourth, in addition to a spoliation sanction, a court will award a prevailing litigant the reasonable costs associated with the motion plus any investigatory costs into the spoliator’s conduct.[8]


The Timeliness of a Spoliation Motion. Although this element of a motion for discovery sanctions isn’t covered by Fed.R.Civ.P. 37, the following factors have been considered in judicial assessments:

(1) [K]ey to the discretionary timeliness assessment of lower courts is how long after the close of discovery the relevant spoliation motion has been made . . . .[9]

(2) [A] court should examine the temporal proximity between a spoliation motion and motions for summary judgment.[10]

(3) [C]ourts should be wary of any spoliation motion made on the eve of trial.[11]

(4) [C]ourts should consider whether there was any governing deadline for filing spoliation motions in the scheduling order issued pursuant to Fed.R.Civ.P. 16(b) or by local rule.[12] [and]

(5) [T]he explanation of the moving party as to why the motion was not filed earlier should be considered.[13]


In sum, Judge Grimm says that these motions should ideally be filed during the discovery phase in order to accommodate the court’s determination of “when the duty to preserve commenced, whether the party accused of spoliation properly complied with its preservation duty, the degree of culpability involved, the relevance of the lost evidence to the case, and the concomitant prejudice to the party that was deprived of access to the evidence because it was not preserved.”[14]


[1] __ F.Supp.2d __, 2009 WL 1955805 (D.Md. July 7, 2009).
[2] Id. at *9, citing United Med. Supply Co. v. United States, 77 Fed. Cl. 257, 263-64 (2007), and Chambers v. NASCO, Inc., 501 U.S. 32, 45-46, 111 S.Ct. 2123, 115 L.Ed.2d 27 (1991).
[3] Id. citing United Med. Supply Co. v. United States, 77 Fed. Cl. 257 at 264 (2007).
[4] Id. at *12 (citing Thompson, 219 F.R.D. at 101, and Zubulake v. UBS Warburg, LLC, 220 F.R.D. 212, 220 (S.D.N.Y.2003)).
[5] Id. at *19, citing Poell v. Town of Sharpsburg, 591 F.Supp.2d 814, 820 (E.D.N.C. 2008).
[6] Id.
[7] Id. at *18.
[8] Id. at *22.
[9] Id. at *10, citing McEachron v. Glans, No. 98-CV-17 (LEK/DRH), 1999 WL 33601543, at *2 & n.3 (N.D.N.Y. June 8, 1999).
[10] Id.
[11] Id. citing Permasteelisa CS Corp. v. Airolite Co., LLC, No. 2:06-cv-569, 2008 WL 2491747, at *2-3 (S.D. Ohio June 18, 2008).
[12] Id.
[13] Id.
[14] Id. at *11.

Logik — Logik Gets New Digs

Logik Gets New Digs Posted By Andy Wilson on September 4, 2009

Logik is moving downtown this month!  We will miss our Dupont office, especially since it was our first office and we put so much work into it.  But every growing company has to move on and seek new office space to accommodate that growth.  When we first set out to find our new home we knew we wanted something different, just as we did with our Dupont location.  We also needed it to be bigger and definitely have more than one bathroom.

After a long search we found our new home at 707-709 G Street (it’s 2 buildings merged into 1).  The space was designed and built by Faison architects, the former tenants.  They put a ton of work into the space to make it open and inviting, so we didn’t need to do much more than paint it.  We can comfortably fit 40 or so Logik people in the space and we plan on doing just that in the next 4 years.  Here are the pics:

Logik — Loose Clicks Sink Ships

Loose Clicks Sink Ships Posted By Daniel Kaiser, Esq. on August 31, 2009

Need Another Reason to Review your Information Management System? You don’t think so? Here’s one anyway provided on June 4, 2009 by the U.S. Court of Appeals (8th Cir.): American Boat Company, Inc. v. Unknown Sunken Barge.[1]

This case really should be subtitled “Are You Being Served?” – although it sadly lacks the ironic humor and the English accents.

In February a towboat company called American Boat lost one of its towboats, to the tune of $3 million in damages, in a collision with a hidden submerged barge on the lower Mississippi River. American Boat brought an action against the United States alleging negligence for failure to maintain a navigable channel. Facing a district court summary judgment for the U.S., counsel for American Boat filed a Motion to Amend Judgment or in the Alternative for Reconsideration.

At this point someone somewhere seems to have dropped the ball.

The District Court issued an adverse final ruling on both of American Boat’s motions. The court notified local counsel of this ruling by email only, via its new electronic notification system, a common practice when attorneys have signed up for such a system. In this case, American Boat’s local counsel (but not their trial counsel) had signed up for electronic notification. American Boat apparently never received this message, and so failed to appeal the final order within the allowable period. Four months later, too late to file an appeal, American Boat’s attorneys learned of the adverse final ruling through the court’s Public Access to Court Electronic Records (PACER) website. Counsel for American Boat cried foul.

First, it was of no importance that the court sent American Boat’s trial counsel neither electronic nor written notice. The court said it sent electronic notice to the party’s local counsel, thus the party was deemed to be on notice. American Boat’s counsel filed affidavits claiming that the office of local counsel never in fact received the court’s automatically generated email notification. The court kept the denials rolling; finding that counsel had received timely electronic notice as reflected by the court docket, the court denied American Boat’s Motion to Reopen. Forced to contend with a presumption of delivery for email sent by the court’s electronic notification system,[2] only an appeal to the Eighth Circuit won American Boat an evidentiary hearing to determine whether the email in question was ever truly received. Following the hearing, once again, American Boat’s Motion to Amend Judgment or for Reconsideration was denied. Expert analysis of counsel’s computers brought the court to the conclusion that although no sign of the email notification could be found on counsel’s hard drives, the law firm had in fact received notice of the district court’s ruling as reflected in the court’s docket entry.

Digging deeper, it seems that a staff member working for local counsel had the job of checking both her own email account and that of American Boat’s attorney. Expert examination of this staff member’s office hard drive revealed no trace of the court’s email Notice. On the other hand, evidence was presented that the Notice was correctly addressed by the court and was successfully received by the law firm’s ISP server. The law firm used an email software program that would download (“POP”) mail from the ISP’s server to the firm’s own local storage once that email was accessed. By default such programs then delete the original record from the ISP server, meaning the email then exists only at its final point of destination. The firm had not changed this default setting, so any item of email resided only on the hard drive of the computer used to access it. Oddly enough, although the staff member sometimes used the firm’s front-desk computer to check emails, nobody went to the trouble to examine that particular hard drive. If the staffer opened the court’s email Notice from the front desk, this Notice would never have made it to her own office computer. An expert witness testified to “95 percent” certainty that this is what, in fact, happened.
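
To make the download-and-delete behavior concrete, here is a minimal in-memory sketch. It is a simulation only, not the firm’s actual software or a real POP3 client: the class names, the `leave_on_server` flag, and the message text are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MailServer:
    """Stands in for the ISP's mail server (hypothetical, in-memory)."""
    inbox: list = field(default_factory=list)


@dataclass
class Workstation:
    """A computer running a POP-style mail client."""
    name: str
    local_mail: list = field(default_factory=list)

    def check_mail(self, server: MailServer, leave_on_server: bool = False) -> None:
        # Download everything waiting on the server to this machine.
        self.local_mail.extend(server.inbox)
        # Default POP-style behavior: remove the server copy after download.
        if not leave_on_server:
            server.inbox.clear()


server = MailServer(inbox=["Notice of final order"])
front_desk = Workstation("front desk")
office = Workstation("staff office")

front_desk.check_mail(server)  # the first machine to connect takes the only copy
office.check_mail(server)      # nothing is left for the second machine

print(front_desk.local_mail)   # ['Notice of final order']
print(office.local_mail)       # []
print(server.inbox)            # []
```

Calling `check_mail(server, leave_on_server=True)` instead would leave the message on the server for the next machine to download as well.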

Based on such evidence, the 8th Circuit upheld the district court’s determination that American Boat did receive the court’s Notice, thus the Motion to Reopen was correctly denied. The 8th Circuit held that “[o]nce the electronic notifications reached the ISP, they were available to local counsel for American Boat, in the same way that a letter that has reached a post office box becomes available to the owner of that box.”[3] The court placed the burden to rebut the presumption of delivery of Notice upon plaintiffs,[4] and providing proof that one computer used by the law firm held no trace of said Notice was not proof adequate to rebut the presumption. Failing to rebut the presumption, American Boat bore the consequences.

So just in case you needed another reason to review your Information Management System, do you know if you’re being served? What you don’t know can really hurt you. Modifying your email software program and tweaking your backup storage policy to ensure retention of your records are small ways to avoid big pain.

[1] No. 08-2166 (8th Cir. 2009).
[2] See Kennell v. Gates, 215 F.3d 825, 829 (8th Cir. 2000).
[3] American Boat Company, Inc. v. Unknown Sunken Barge, No. 08-2166 (8th Cir. 2009).
[4] Am. Boat Co., 418 F.3d at 914 (8th Cir. 2005).

Logik — Searching…Sorting Through the Tool Box

Searching…Sorting Through the Tool Box Posted By Daniel Kaiser, Esq. on August 28, 2009

You may know what you’re looking for, but do you know how to look?

For those who are engaged in eDiscovery, two cases touching on search methodologies that have held our attention over the past year include Magistrate Judge John Facciola’s decision in U.S. v. O’Keefe[1], and Magistrate Judge Paul Grimm’s decision in Victor Stanley, Inc. v. Creative Pipe, Inc.[2] Facciola’s harangue regarding the complex nature of ESI searches may have assured his immortality, and it is too good to resist quoting yet again:

Whether search terms or ‘keywords’ will yield the information sought is a complicated question involving the interplay, at least, of the sciences of computer technology, statistics and linguistics…. Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread.[3]

In this vein, Facciola noted that searching was best left to the experts.  On the other hand, Grimm, emphasizing cross-party collaboration, sees the creation of search protocols as potentially falling within attorney competency – so long as the attorney has performed quality assurance testing on the methodology selected, can explain the rationale for selecting the methodology, and can show proper implementation.[4]

Facciola and Grimm come by their wariness honestly.  The Text Retrieval Conference (TREC) series is a research body co-sponsored by the NIST and the IARPA (Logik is a 2009 TREC participant).  The TREC Legal Track “focuses on evaluation of search technology for discovery of electronically stored information in litigation and regulatory settings.”[5]  The Overview of the TREC 2008 Legal Track reports that “the consensus Boolean query found 42% of the highly relevant documents, on average per topic, . . . [and 33%] of all relevant documents.”[6] Further, negotiated Boolean keyword searches were found to be on par with the newer and more complex search methods tested.[7]  In fact, keyword searches can be notably strengthened when they are performed in an iterative fashion: sampling the search results, and then adjusting the negotiated keywords to improve the results.  Yet it has been observed that although various search methodologies may return a comparable percentage of recall, the actual responsive documents retrieved vary – allowing a higher rate of recall through the use of mixed search technologies on the same data set.[8]
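
The recall point can be illustrated with a small sketch. The document IDs and result sets below are invented for illustration and are not TREC data; the function names are ours. Recall is the share of all relevant documents that a search retrieved, and precision is the share of retrieved documents that are relevant.

```python
def recall(retrieved: set, relevant: set) -> float:
    """Fraction of the relevant documents that the search found."""
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved: set, relevant: set) -> float:
    """Fraction of the retrieved documents that are actually relevant."""
    return len(retrieved & relevant) / len(retrieved)


relevant = set(range(1, 11))         # pretend documents 1-10 are responsive
boolean_hits = {1, 2, 3, 4, 20, 21}  # hypothetical Boolean search results
concept_hits = {3, 4, 5, 6, 7, 30}   # hypothetical concept search results
combined = boolean_hits | concept_hits

print(recall(boolean_hits, relevant))  # 0.4
print(recall(concept_hits, relevant))  # 0.5
print(recall(combined, relevant))      # 0.7
print(precision(combined, relevant))   # 0.7
```

In this toy example the two methods have similar recall on their own but find different documents, so running both on the same data set raises combined recall.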

This emerging data, along with recent judicial enthusiasm for the incorporation of concept searching[9], reinforces the idea that attorneys need to develop a comfortable working knowledge of the array of electronic data search technologies.  The following non-exclusive list of search methodologies and vocabulary is intended as a reference for those who are finding their way through the terminological wrangle and getting to know the eDiscovery landscape:

  • Keyword Search:  A search through a body of data for a stipulated word or set of words.  Keywords are useful in finding documents containing a specific term.
  • Boolean Search:  Keyword searching with the aid of Boolean operators such as “AND”, “OR”, “NOT”, “W/#”, “( )”, “NEAR”, “TI=( )”, “BEFORE”, “AFTER”, “*”, or “!” (proximity designators, phrase designators, sequencing instructions, and word-trunk expanding instructions) to increase the searcher’s precision in included or excluded results.
  • Fuzzy Logic:  A search method using non-exact word matching to capture results that include variations of, or misspellings of, stipulated search terms.
  • Concept Search:  The use of sophisticated (and often proprietary) mathematical and linguistic analysis to return results pertaining to the concept and context suggested by your search term(s).  The concept upon which your search results are based may or may not be literally present in your search terms, or in your search results.
  • Algebraic Search:  A search using mathematical models, including Boolean proximity operators, to interpret meaning in a document and to retrieve results accordingly.
  • Clustering:  The grouping of documents with related content into “clusters,” within which documents are often given a statistical ranking in their relationship to a template or seed document.  These documents may be found to be related through an overlap of concepts and contexts, or through an overlap of specific terms.  The use of this search method may provide the searcher access to the entire cluster, or may provide the searcher with related, alternative search terms.
  • Concept and Categorization Tools:  Search methods based on the use of a given thesaurus to return results from documents that express the same concept contained in the search term(s), in an alternative fashion.
  • Linguistic Methods:  Search methodologies that classify or select text documents based on a given taxonomy, ontology, or thesaurus.
  • Naive Bayes Classifier:  Based on Bayes’ theorem, a predictive relevance value is assigned to particular words according to their interrelationships, recurrence, a word’s position within a document, and proximity to other search terms.
  • Ontologies:  An ontology is similar to a taxonomy, but the relationships between terms need not be hierarchical and are broad (including synonyms and associated ideas).  Using this search methodology, a searcher entering the term “tort” could pull results from documents containing the terms “litigation” or “damages.”
  • Probabilistic Latent Semantic Analysis:  In brief, this method of analysis (or indexing) uses a probabilistic model to retrieve text containing polysemy (words having multiple similar meanings) and synonymy (words having the same meaning).
  • Probabilistic Search Models (including Bayesian Classifiers):  Probability formulas, including Bayesian methods, are used to determine the relevance of documents within a search pool – often incorporating a term’s historical relevance to the particular search performed to rank the search results.
  • Social Network Analysis:  An analysis and mapping of the interactions or associations amongst sets of nodes (actors, people, entities, information sources) into a complex grid representation of a network.  Significance may be found in various factors such as the centrality of a node.
  • Taxonomies:  The hierarchical classification of terms and ideas into categories or sets, and subcategories or subsets.  The use of this tool enables the searcher, for example, to retrieve results from any subcategory of their search query.  A search for “tort” could pull results from documents containing the terms “negligence” or “nuisance.”
  • Vector Space Retrieval:  A search methodology based on the Vector Space Model.  This method measures the similarity between documents, premised upon the idea that similarity may be used to indicate relevance.  The model represents various documents as vectors in space, with those deemed to be more similar being positioned closer together in space. 
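
For readers who want to see a few of these ideas in motion, here is a minimal sketch of keyword, Boolean (AND), fuzzy, and vector space searching over a three-document toy corpus. Everything here is an assumption made for illustration: the documents are invented, the function names are ours, and real eDiscovery platforms are far more sophisticated. The fuzzy matcher uses Python’s standard `difflib`, and the vector space ranking uses plain word counts with cosine similarity.

```python
import difflib
import math
import re
from collections import Counter

# Invented toy corpus, for illustration only.
DOCS = {
    "memo": "The contractor was negligent in the tank inspection",
    "email": "Please preserve all inspection records for the litigation",
    "note": "Lunch order for the team meeting",
}


def tokens(text: str) -> list:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_search(term: str) -> set:
    """Documents containing the exact term."""
    return {name for name, text in DOCS.items() if term.lower() in tokens(text)}


def boolean_and(term_a: str, term_b: str) -> set:
    """Documents containing both terms (the AND operator)."""
    return keyword_search(term_a) & keyword_search(term_b)


def fuzzy_search(term: str, cutoff: float = 0.8) -> set:
    """Documents containing a word that closely resembles the term."""
    return {
        name
        for name, text in DOCS.items()
        if difflib.get_close_matches(term.lower(), tokens(text), n=1, cutoff=cutoff)
    }


def cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity between word-count vectors of two texts."""
    va, vb = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(va[word] * vb[word] for word in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def vector_rank(query: str) -> list:
    """Document names ordered from most to least similar to the query."""
    return sorted(DOCS, key=lambda name: cosine(query, DOCS[name]), reverse=True)


print(keyword_search("inspection"))
print(boolean_and("inspection", "litigation"))
print(keyword_search("litigaton"))  # exact matching misses the misspelling
print(fuzzy_search("litigaton"))    # fuzzy matching still finds the email
print(vector_rank("negligent inspection"))
```

The misspelled query returns nothing under exact keyword matching but still finds the email under fuzzy matching. The vector space ranking puts the memo first because it shares the most query terms relative to its length.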

[1] U.S. v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008).
[2] Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008).
[3] U.S. v. O’Keefe, 537 F. Supp. 2d 14, 24 (D.D.C. 2008).
[4] Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008) (citing The Sedona Conference Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery, 8 Sedona Conf. J. 189 (2007)).
[5] http://trec.nist.gov/pubs/trec17/papers/LEGAL.OVERVIEW08.pdf.
[6] http://trec.nist.gov/pubs/trec17/papers/LEGAL.OVERVIEW08.pdf at 5.
[7] Jason Krause, In Search of the Perfect Search, A.B.A.J. (Apr. 2009), http://www.abajournal.com/magazine/in_search_of_the_perfect_search/.
[8] Id.
[9] See Disability Rights Council of Greater Wash. v. Wash. Metro. Area Transit Auth., 2007 WL 1585452 (D.D.C. June 1, 2007).

Logik — Is eDiscovery Processing a Commodity?

Is eDiscovery Processing a Commodity? Posted By Andy Wilson on August 26, 2009

Lately I’ve heard quite a few people in the eDiscovery industry throw around the word commodity when discussing processing.

Which raises the question: is eDiscovery processing really a commodity?

If so, why so?

If not, why not?


I’ll start.  First things first: my company (Logik) does just eDiscovery processing, so I have an obviously biased response, but I think it’s a logikal one.  Because the market is saturated with over 600 eDiscovery vendors, buyers of eDiscovery services may find it hard to see any real difference and thus focus on price alone, which supports the commodity frame of mind.  But, unlike grocery stores (that sell commodity goods), saturation in eDiscovery does not equal sameness.

As document formats continue to change and new eDiscovery services come about from processing (think: how do you collect and process data in the “Cloud”?), processing becomes more and more dynamic.  This constant change in technology platforms alone makes processing a non-commodity service. 

In order for eDiscovery processing to truly become a commodity, I think one vendor’s product and technology offering needs to become indistinguishable from any other vendor’s.

Has your experience been the same with every eDiscovery vendor’s technology and product?  Why do we still fill out RFPs about our technology, product, and service?  If everything were equal among vendors…we wouldn’t need to do these.  Price alone would determine everything.

This is my opinion; what’s yours?

Logik — Litigation Watch – the Fourth Circuit Whacks a Hack

Litigation Watch – the Fourth Circuit Whacks a Hack Posted By Daniel Kaiser, Esq. on August 25, 2009

Earlier this spring, the United States Court of Appeals for the Fourth Circuit took a notable stand to strengthen the Stored Communications Act (SCA).

In Van Alstyne v. Electronic Scriptorium[1] the court broke new ground in SCA litigation by ruling that a civil litigant may be awarded attorney’s fees and punitive damages even in the absence of any proof of actual damages, although statutory damages will be withheld. In this case the court found that for more than a year a former employer had repeatedly accessed his former employee’s personal email account (as opposed to a company account with privacy waivers) – thus violating her personal privacy.

The operative SCA subsection reads as follows:

(c) Damages.— The court may assess as damages in a civil action under this section the sum of the actual damages suffered by the plaintiff and any profits made by the violator as a result of the violation, but in no case shall a person entitled to recover receive less than the sum of $1,000. If the violation is willful or intentional, the court may assess punitive damages. In the case of a successful action to enforce liability under this section, the court may assess the costs of the action, together with reasonable attorney fees determined by the court.[2]

The court noted that the SCA’s provision for punitive damages and attorney’s fees, found in the second and third sentences of this subsection, lacks the limiting language “actual damages suffered” that can be seen in the SCA’s provision for statutory damages, found in the first sentence of this subsection.

Interestingly, this is a case in which the former employee had initiated employment claims against her former employer, and it seems that the former employer wanted to conduct his own unofficial, hacker-style eDiscovery. Although one would think it goes without saying, eDiscovery is discovery; rules do apply!

[1] 560 F.3d 199 (4th Cir. 2009).
[2] The Stored Communications Act, 18 U.S.C. § 2707(c) (1986).

Logik — Did “One Size Fits All” Ever Really Work?

Did “One Size Fits All” Ever Really Work? Posted By Daniel Kaiser, Esq. on August 24, 2009

The Final Report on the Joint Project of the American College of Trial Lawyers (ACTL) Task Force on Discovery and The Institute for the Advancement of the American Legal System (IAALS)

Change is in the air.  On March 20, 2009, an eighteen-month collaboration between the ACTL and the IAALS came to fruition through their joint release of 29 Principles, marking the launch of a new nationwide movement to reform both federal and state rules of civil procedure.  The Report includes Proposed Principles touching on eDiscovery (see specifically Principles 12 – 18), and in the coming months these two organizations, together with contributing members of the top echelon of the American and Canadian Trial bar, will be working to assist the implementation of these Principles into pilot projects in the U.S. civil justice system.

Flagging inefficiencies, disproportionate costs and delays, the Final Report emphasizes that the civil justice system is “in serious need of repair,” and that “[t]he traditional ‘one size fits all’ application of uniform rules to all cases . . . no longer works.”  Many of us are left to wonder if, in fact, it ever really worked.  We can watch for these 29 Principles, together with the Sedona Principles, to be instrumental in the retooling of the rules of civil procedure across the United States.

  • A complete list of the 29 Proposed Principles follows:
    1. The “one size fits all” approach of the current federal and most state rules is useful in many cases but rulemakers should have the flexibility to create different sets of rules for certain types of cases so that they can be resolved more expeditiously and efficiently.
    2. Notice pleading should be replaced by fact-based pleading.  Pleadings should set forth with particularity all of the material facts that are known to the pleading party to establish the pleading party’s claims or affirmative defenses.
    3. A new summary procedure should be developed by which parties can submit applications for determination of enumerated matters (such as rights that are dependent on the interpretation of a contract) on pleadings and affidavits or other evidentiary materials without triggering an automatic right to discovery or trial or any of the other provisions of the current procedural rules.
    4. Proportionality should be the most important principle applied to all discovery.
    5. Shortly after the commencement of litigation, each party should produce all reasonably available nonprivileged, non-work product documents and things that may be used to support that party’s claims, counterclaims or defenses.
    6. Discovery in general and document discovery in particular should be limited to documents or information that would enable a party to prove or disprove a claim or defense or enable a party to impeach a witness.
    7. There should be early disclosure of prospective trial witnesses.
    8. After the initial disclosures are made, only limited additional discovery should be permitted.  Once that limited discovery is completed, no more should be allowed absent agreement or a court order, which should be made only upon a showing of good cause and proportionality.
    9. All facts are not necessarily subject to discovery.
    10. Courts should consider staying discovery in appropriate cases until after a motion to dismiss is decided.
    11. Discovery relating to damages should be treated differently.
    12. Promptly after litigation is commenced, the parties should discuss the preservation of electronic documents and attempt to reach agreement about preservation.  The parties should discuss the manner in which electronic documents are stored and preserved.  If the parties cannot agree, the court should make an order governing electronic discovery as soon as possible.  That order should specify which electronic information should be preserved and should address the scope of allowable proportional electronic discovery and the allocation of its cost among the parties.
    13. Electronic discovery should be limited by proportionality, taking into account the nature and scope of the case, relevance, importance to the court’s adjudication, expense and burdens.
    14. The obligation to preserve electronically-stored information requires reasonable and good faith efforts to retain information that may be relevant to pending or threatened litigation; however, it is unreasonable to expect parties to take every conceivable step to preserve all potentially relevant electronically stored information.
    15. Absent a showing of need and relevance, a party should not be required to restore deleted or residual electronically-stored information, including backup tapes.
    16. Sanctions should be imposed for failure to make electronic discovery only upon a showing of intent to destroy evidence or recklessness.
    17. The cost of preserving, collecting and reviewing electronically-stored material should generally be borne by the party producing it but courts should not hesitate to arrive at a different allocation of expenses in appropriate cases.
    18. In order to contain the expense of electronic discovery and to carry out the Principle of Proportionality, judges should have access to, and attorneys practicing civil litigation should be encouraged to attend, technical workshops where they can obtain a full understanding of the complexity of the electronic storage and retrieval of documents.
    19. Requests for admissions and contention interrogatories should be limited by the Principle of Proportionality.  They should be used sparingly, if at all.
    20. Experts should be required to furnish a written report setting forth their opinions, and the reasons for them, and their trial testimony should be strictly limited to the contents of their report.  Except in extraordinary cases, only one expert witness per party should be permitted for any given issue.
    21. A single judicial officer should be assigned to each case at the beginning of a lawsuit and should stay with the case through its termination.
    22. Initial pretrial conferences should be held as soon as possible in all cases and subsequent status conferences should be held when necessary, either on the request of a party or on the court’s own initiative.
    23. At the first pretrial conference, the court should set a realistic date for completion of discovery and a realistic trial date and should stick to them, absent extraordinary circumstances.
    24. Parties should be required to confer early and often about discovery and, especially in complex cases, to make periodic reports of those conferences to the court.
    25. Courts are encouraged to raise the possibility of mediation or other form of alternative dispute resolution early in appropriate cases.  Courts should have the power to order it in appropriate cases at the appropriate time, unless all parties agree otherwise.  Mediation of issues (as opposed to the entire case) may also be appropriate.
    26. The parties and the courts should give greater priority to the resolution of motions that will advance the case more quickly to trial or resolution.
    27. All issues to be tried should be identified early.
    28. These Principles call for greater involvement by judges.  Where judicial resources are in short supply, they should be increased.
    29. Trial judges should be familiar with trial practice by experience, judicial education or training and more training programs should be made available to judges.

The full text of this final report can be found here.

As a practice tip it is important to keep in mind that these are, at present, Proposed Principles.  Yet change is in the air, and this just might be a peek into the not-too-distant future of the American system of civil justice. 

Logik — ALI Stung like a Bee

ALI Stung like a Bee Posted By Daniel Kaiser, Esq. on August 21, 2009

Microsoft and Linux urged ALI to Float like a Butterfly - ALI Stung like a Bee

Citing a need for flexibility in commercial law and freedom of contract, and hoping for a lighter touch, the Linux Foundation and Microsoft recently sent a joint open letter to the American Law Institute (ALI) urging it to reconsider its pending Principles of the Law of Software Contracts. Although competitors in the market, the two came together to argue that the language of the forthcoming Principles discriminated among business models, that it would harm the legal climate surrounding the provision of software and related services and support, and that its release should be delayed to allow further input from the software development and user community.

Microsoft and the Linux Foundation took issue primarily with § 3.05(b), which calls for a non-disclaimable implied warranty of no material hidden defects for all transferors that receive “money or a right to payment of a monetary obligation in exchange for the software . . .” The Linux Foundation points out the ambiguity of the concept of “free” under this language, given that providers of “free software” may yet be able to obtain payment (for example, through advertisement delivery or support services). Collectively, the authors cite inconsistencies of the ALI’s draft with the Uniform Commercial Code, general commercial law, and public policy. The authors of this open letter urge that the implied warranty of no material hidden defects should continue to be disclaimable, claiming that this “would cover individual contributors to open source projects, as most open source licenses disclaim warranties and indicate that the software code is provided ‘as is’.”

Despite the united appeals of industry players coming from opposite ends of the software licensing spectrum, in May the ALI unanimously approved the final draft of the Principles of the Law of Software Contracts. Perhaps motivated by the desire to place the risk of defective software on the party best able to manage that risk, § 3.05(b) as referenced above remains applicable in the approved draft. The membership of the ALI might argue that this section only applies in situations where the provider is aware of a material defect that is hidden from the consumer, yet software providers may well be concerned by the unpredictability of a court’s application of “hidden” or “material.”

One way or another, with the highly persuasive nature of the ALI’s Principles, courts are likely to start applying this particular release soon. You can read the Principles of the Law of Software Contracts for yourself, but don’t hold your breath for an open source version. The ALI’s website offers a download – but, of course, for a price.

Logik — Will Sotomayor Weigh-In on eDiscovery?

Will Sotomayor Weigh-In on eDiscovery? Posted By Daniel Kaiser, Esq. on August 20, 2009

Any edition of the high-court shuffle will always attract attention. Although it is rare to see the Supreme Court ruling specifically on a question of eDiscovery, Court watchers have been interested to see how the addition of Sotomayor might influence the Court in the event of a relevant controversy. Having specialized in intellectual property while working with the firm of Pavia & Harcourt, and having at times touched on technology over the course of her more than 150 decisions, appeals court judge Sonia Sotomayor has created a record worth speculating over.

Bringing her history in intellectual property to bear, Sotomayor appeared comfortable in technology-based cases when she wrote opinions in a few Anticybersquatting Consumer Protection Act cases in the early part of this decade. Examples can be found in Storey v. Cello Holdings, L.L.C.[1] and Mattel, Inc. v. Barbie-Club.com.[2]

Perhaps most specifically relevant to eDiscovery was Sotomayor’s opinion in Leventhal v. Knapek.[3] This wasn’t an eDiscovery case per se, yet it touched on closely related issues. Here the U.S. Court of Appeals for the Second Circuit (located in New York City) considered and rejected a challenge to the actions of a public employer in its search of an employee’s office computer for evidence of alleged work-related misconduct. Plaintiff Gary Leventhal worked as an accountant for the New York State Department of Transportation (DOT). An anonymous letter, in which plaintiff was not mentioned by name, complained of various acts of job-related misconduct in the DOT’s accounting department, including tardiness, absence, and excessive personal pursuits and conversations during DOT work time. In response, the DOT conducted its own variation of eDiscovery, compelling a search of plaintiff’s office computer and the computers of other accounting employees for “non-standard” computer programs. In the face of a Fourth Amendment challenge brought by plaintiff, Sotomayor concluded that although a public employee has a “reasonable expectation of privacy in the contents of his office computer,” in this case the search and seizure did not violate plaintiff’s Fourth Amendment rights.

The court acknowledged that “the Fourth Amendment protects individuals from unreasonable searches conducted by the Government, even when the Government acts as an employer.”[4] Sotomayor went on to clarify that “[t]he ‘special needs’ of public employers may, however, allow them to dispense with the probable cause and warrant requirement when conducting workplace searches related to investigations of work-related misconduct.”[5] Finally, Sotomayor stated that “[a] public employer’s search of an area in which an employee had a reasonable expectation of privacy is ‘reasonable’ when ‘the measures adopted are reasonably related to the objectives of the search and not excessively intrusive in light of’ its purpose.”[6]

In this context the Second Circuit ruled the DOT’s search of an employee’s office computer to be justified at its inception and reasonable in its scope – finding that “the searches of his computer were ‘reasonably related’ to the DOT investigation of allegations of [plaintiff’s] workplace misconduct.”

[1] Storey v. Cello Holdings, L.L.C., 347 F.3d 370 (2d Cir. 2003).
[2] Mattel, Inc. v. Barbie-Club.com, 310 F.3d 293 (2d Cir. 2002).
[3] Leventhal v. Knapek, 266 F.3d 63 (2d Cir. 2001).
[4] Nat’l Treasury Employees Union v. Von Raab, 489 U.S. 656, 665 (1989).
[5] Citing O’Connor v. Ortega, 480 U.S. 709, 719-26 (1987) (plurality opinion); id. at 732 (Scalia, J., concurring).
[6] Citing O’Connor, 480 U.S. at 726 (plurality opinion) (internal quotation marks omitted).

Logik — Just how itchy is that trigger finger?

Just how itchy is that trigger finger? Posted By Daniel Kaiser, Esq. on August 19, 2009

Our tip of the hat to Ralph Losey for his early comments on Phillip M. Adams & Associates, L.L.C., v. Dell, Inc., a recent case that has been turning heads everywhere. This case is certainly worth a read, and although it touches on a topic covered in one of our earlier posts, the outcome was surprising enough to be worth exploring again.

This was a case in which the defendant was sanctioned for not implementing a litigation hold, thus eliminating emails and data dated as far back as 1999. The catch: the defendant apparently did not receive notice from the plaintiff of a potential infringement claim until 2005, and claims to have implemented a litigation hold from that point forward. The Utah Magistrate judge reasoned that the entire computer and component manufacturing industry was essentially on notice of potential litigation (and, as a result, that litigation holds should have been triggered) in 1999 due to class action lawsuits filed against certain players in the industry in 1999 and 2000 based upon claims of defects in floppy disk controllers.

Moving on, the Magistrate found the defendant’s electronic information system architecture, which relied upon each employee to archive or delete their own emails and documents according to company practices, to be unreasonable. The Magistrate referred to the company’s system architecture as one “of questionable reliability which has evolved rather than been planned . . . .”[1] Even more to the point, the Magistrate did not find the defendant to be in possession or control of a coherent document retention policy.

At stake: the ability of companies to rely on the Safe Harbor protections of FRCP 37(e), which reads:

(e) Failure to Provide Electronically Stored Information.  Absent exceptional circumstances, a court may not impose sanctions under these rules on a party for failing to provide electronically stored information lost as a result of the routine, good-faith operation of an electronic information system.[2]

A few problems leap out of the pages of this ruling:

  • Ralph Losey does a nice job of pointing out that per Rule commentary and case precedent, the good-faith requirement of Rule 37(e) refers to destruction of electronically stored information prior to the triggering of a duty to preserve (rather than the subjective reasonability of the electronic information system itself). He also makes the point that Sedona does not indicate reasonability adjudications for records management systems as a prerequisite for Rule 37(e) protection. On the other hand, one should keep in mind that Sedona does mandate reasonability in email retention policies by saying that “[w]hatever their form, email retention policies must be reasonable in purpose and reasonable as applied.”[3]
  • It is also interesting to note that even if there had been class actions in the industry, they had been directed against different companies for a product defect. If the defendant’s trigger for a litigation hold had been tripped, by the Magistrate’s reasoning it would be for an issue different than the one in question: patent infringement.
  • By introducing dubious triggers to the litigation hold, this decision tends to weaken the Rule 37(e) Safe Harbor and promotes potentially cumbersome wide-angle data retention any time tangentially related lawsuits are taking place in a particular sector of the market.
  • Based on the record in the text of the opinion, why didn’t the defendant do a little more to set up the elements of their Rule 37(e) defense to begin with?

On that last point, the Utah Magistrate judge seemed to be treading on more familiar ground in charging that “[the defendant] offers no statements from management-level persons explaining its practices, or existence of any policies.”[4] If not, why not?

The Magistrate went on to reference Guideline 1 of The Sedona Guidelines: Best Practice Guidelines & Commentary for Managing Information & Records in the Electronic Age (November 2007) as follows: “[a]n organization should have reasonable policies and procedures for managing its information and records.”[5] This built up to the Magistrate judge’s conclusion that “[t]he absence of a coherent document retention policy is a pertinent factor to consider when evaluating sanctions.”[6]

Even if the ruling comes across, on the whole, as heavy-handed, and even if this decision is reversed on appeal, these final points are important. While it’s anyone’s guess what the Magistrate judge may consider to be a “coherent” document retention policy (or, for that matter, whether most of us would agree with him), a management-level explanation of the defendant’s practices and policies does not seem hard to deliver. The defendant claimed Rule 37(e) Safe Harbor protection against sanctions for pre-litigation elimination of electronically stored information. When making such a claim, always remember to build the elements of your defense in advance through careful implementation and oversight of a company-wide document retention policy.[7] Then, when you need them, you can argue the elements supporting your Rule 37(e) defense piece by piece.


[1] Phillip M. Adams & Associates, L.L.C., v. Dell, Inc., 2009 WL 910801 (D.Utah March 30, 2009).
[2] Fed. R. Civ. P. 37(e).
[3] The Sedona Conference Working Group on Electronic Document Retention & Production (WG1), The Sedona Conference Commentary on Email Management: Guidelines for the Selection of Retention Policy, 8 The Sedona Conference Journal 239, 240 (Fall 2007).
[4] Phillip M. Adams & Associates, L.L.C., v. Dell, Inc., 2009 WL 910801 (D.Utah March 30, 2009).
[5] Id.
[6] Id.
[7] See ESI Maintenance – Sailing the Safe Harbor, posted March 12, 2009.

Logik — How to make a quick-n-dirty histogram

How to make a quick-n-dirty histogram Posted By Adam Reilly on August 19, 2009

Most people know that Microsoft Excel can produce a wide variety of charts to visualize data.  However, if you need to summarize more rows than Excel can load, or you need SQL for more flexible data manipulation, Microsoft Access also provides a feature called “pivot charts,” which lets users generate a quick visual summary of a query.

We’ll start by importing a sample set of data obtained from the Internet.  The data is in the form of Comma Separated Values, or CSV, which is a common data interchange format.
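For readers who want to see what a CSV import consumes, here is a minimal Python sketch that parses a few rows of CSV text. The column names (`Player`, `Position`, `Games`) and the rows themselves are made up for illustration; they are not the data set used in this post.

```python
import csv
import io

# Hypothetical sample rows; the real data set from the post is not shown.
SAMPLE_CSV = """Player,Position,Games
Adams,Outfield,150
Baker,Outfield,142
Clark,Catcher,110
Davis,Shortstop,155
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, converting Games to int."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        record["Games"] = int(record["Games"])
        rows.append(record)
    return rows

rows = load_rows(SAMPLE_CSV)
print(len(rows), "rows imported; first row:", rows[0])
```

The first line of the text supplies the field names, which is the same header-row convention most import tools rely on when working with CSV files.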

Once the data is imported into its own table, we’ll create a new query in design view.  This is a key step, as pivot charts are designed to visualize data from queries.

  • We’ll add the table
  • Select some columns that we’d like to summarize
  • And be sure that the ‘Group By’ feature is turned on
  • Let’s assume we want to know which position played in the most games for this data set
  • Finally, we’ll preview the query

Now that our query is created, we’ll save it with a sensible name.
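The query steps above amount to a grouped summary. As a rough sketch of the same idea outside Access, the Python below runs a comparable SQL statement against an in-memory SQLite database. The table name `players`, its columns, and the rows are assumptions made for illustration, and the SQL Access generates behind the design view may differ in detail.

```python
import sqlite3

# Hypothetical table and rows standing in for the imported CSV data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (player TEXT, position TEXT, games INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [
        ("Adams", "Outfield", 150),
        ("Baker", "Outfield", 142),
        ("Clark", "Catcher", 110),
        ("Davis", "Shortstop", 155),
    ],
)

# Rough equivalent of turning on 'Group By' for position and summing games.
SUMMARY_SQL = """
    SELECT position, SUM(games) AS total_games
    FROM players
    GROUP BY position
    ORDER BY total_games DESC
"""
summary = conn.execute(SUMMARY_SQL).fetchall()
for position, total_games in summary:
    print(position, total_games)
conn.close()
```

Each output row is one category with its summed value, which is the shape of data a pivot chart plots.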

In order to display this data, we’ll need to select the view menu and select ‘Pivot Chart view’

The Chart field list appears with the fields from our query.  Clicking the drop-down at the bottom of the field palette shows the different areas of the chart where data can be added.  Since we want to know how many games each position played, we’ll add Position to categories and sum of games played to the data area.  We can easily see from this summary that the outfielders played more games than any other position.
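If Access isn’t handy, the same category-versus-total picture can be approximated in plain text. This is only a sketch under assumed data: the (position, games) pairs below are invented, and the helper names `games_by_position` and `text_histogram` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (position, games) pairs; not the post's actual data.
RECORDS = [
    ("Outfield", 150),
    ("Outfield", 142),
    ("Catcher", 110),
    ("Shortstop", 155),
]

def games_by_position(records):
    """Sum games played for each position."""
    totals = defaultdict(int)
    for position, games in records:
        totals[position] += games
    return dict(totals)

def text_histogram(totals, width=40):
    """Return one bar line per category, scaled so the largest bar is `width` characters."""
    largest = max(totals.values())
    lines = []
    for name, value in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(value / largest * width)
        lines.append(f"{name:<10} {bar} {value}")
    return lines

totals = games_by_position(RECORDS)
chart = text_histogram(totals)
print("\n".join(chart))
```

Sorting the bars from largest to smallest makes the leading category obvious at a glance, much like the pivot chart does.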

There is a wide array of chart types to choose from, and these charts can be included in reports or other Office documents to add impact.  It’s a quick and easy way to add visualization to data-based summaries.

Logik — Logik Named to Inc. 500 List of Fastest Growing Companies

Logik Named to Inc. 500 List of Fastest Growing Companies Posted By Andy Wilson on August 14, 2009

Out of the 500 companies on this year’s list, Logik is the top eDiscovery company. Wow! We’re thrilled and humbled.

The Inc. 500 is a list of the fastest-growing private companies in the U.S. This award places Logik in company with past honorees such as Microsoft, Timberland, Intuit, Jamba Juice, Oracle, and Under Armour. We can only hope to reach the same success these companies have achieved.

We couldn’t have done this without the support of our family, friends, partners, vendors, and our ahhh-mazing clients.  Thank you so much to everyone that has supported us over the years. We greatly appreciate it.

Click here to see the Inc. 500 page about our ranking

Here is the formal press release we issued about the ranking:

Logik Named to Inc. 500 List of Fastest Growing Companies

DC-Based eDiscovery Company Ranks #181 on the 2009 Inc. 500

For Immediate Release

WASHINGTON DC – August 12, 2009 – DC’s fastest-growing eDiscovery company, Logik, is thrilled and humbled to announce that Inc. magazine has ranked Logik as number 181 on the “2009 Inc. 500” list of fastest-growing companies in America. Out of the 500 companies on the list, Logik is the top eDiscovery company.  There are two groups to thank: Logik employees and Logik customers. Both are directly responsible for the achievement.

“Logik started out as a desire to do something better,” says Andy Wilson, co-founder and CEO at Logik. “We worked for a long time in the eDiscovery industry, watching the large vendors shuffle along, and we saw the mind-boggling inefficiencies. We saw insanely high rates. We saw a better way and we took a chance on it.”

Logik began in the dining room of Andy Wilson’s apartment as a simple, straightforward idea: provide a faster, more accurate and more budget-friendly way to process eDiscovery data. The hard part was inventing the technology.

“Working out of a dining room has its advantages,” laughs Sheng Yang, Logik co-founder and CTO. “Late night after late night we had munchies and soda readily available. While the sugar fueled the work, it was the desire to start something new, interesting and compelling that drove us on day after day. The high ranking by Inc. magazine is a really nice compliment to the technologies we’ve developed to help fill the gap in the industry. What it tells us is that our customers see the value in what Logik offers. And that’s the best compliment there is.”

With themselves as the sum total of two employees in 2005, Andy and Sheng took their eDiscovery data process invention, GridLogik™, from the dining room table to the client table. An incredibly short four years later, Andy and Sheng are still driven to improve the industry they’ve helped to shape. The biggest difference now is that they work from an office in downtown Washington, DC, and keep their twelve employees busy with a ton of work, constant fun and ping-pong matches.

“We know very well the benefits of a creative, relaxed work-environment,” says Andy. “Our flat organization promotes the sharing of ideas and teamwork, team-think and, more important, individual contribution and inventiveness. The only principle we stand on is the one where we deliver exceptionally well for our customers. We have more new ideas coming from our team than we can ever hope to launch this year. So we work with our team to find the best ideas that offer our customers faster results and a better bottom-line. Because frankly, the better our customers do and the happier they are, the better we do as a company. The proof of that simple concept is the growth we continue to experience.”

That’s exactly the formula Logik continues to follow. Fast processing + smart work + happy employees x extremely happy customers = 181. At least, according to the 2009 Inc. 500 list.

“If you want to find out which companies are going to change the world, look at the Inc. 500,” said Inc. Editor Jane Berentson. “These are the most innovative, dynamic, fast-growth companies in the nation, the ones coming up with solutions to some of our most intractable ills, creating systems that let us conduct business faster and easier, and manufacturing products we soon discover we can’t live without. The Inc. 500 list is Inc. magazine’s tribute to American business ingenuity and ambition.”

“Though I have to say,” Sheng wistfully admits, “It’s nice to have a dinner party now without a server rack sitting next to the dining room table.”

About Logik:
Logik is an eDiscovery processing company located in Washington, DC. Logik helps corporations, law firms, government agencies and service providers simplify electronic data sought in discovery requests. Logik’s innovative and highly distributed processing platform, GridLogik, was developed to process all kinds of unstructured and structured data sets such as email databases, spreadsheets, images and MS Office documents. Combined with their transparent pricing model, Logik offers customers the smart way to discover accurate results and make sense of processing costs. Find out more at logik.com.

Media interested in setting up an interview with a representative from Logik should call 800-951-5507.

###

Logik — In the Cloud, Warrants are for the Birds?

In the Cloud, Warrants are for the Birds? Posted By Daniel Kaiser, Esq. on August 14, 2009

I see skies of blue – and clouds of white – a bright blessed subpoena! You mean warrant, right?

Nope. We respect you for trying, but they meant subpoena. (…what a wonderful world…)

In U.S. v. Weaver, a Seventh Circuit district court addressed the question of whether a court can, via subpoena, compel an Internet Service Provider’s (Microsoft’s) production of a subscriber’s opened emails which are less than 181 days old. 2009 WL 2163478 (C.D.Ill.). This was a case of first impression within the Seventh Circuit, and it clarified Theofel v. Farey-Jones, a previous Ninth Circuit ruling. 359 F.3d 1066 (9th Cir. 2004). Whereas the court in Theofel found that circumstances called for the use of a warrant, the court in Weaver said that a subpoena would suffice.

In Weaver, seeking Defendant’s Hotmail records, the Government submitted a trial subpoena to Microsoft requiring the production of “the contents of electronic communications [emails] (not in ‘electronic storage’ as defined by 18 U.S.C. § 2510(17)).” The Government specified that this would “include the contents of previously opened or sent email.” Microsoft, in turn, replied that due to its location (headquartered in Redmond, Washington, in the Ninth Circuit), it was bound by the Ninth Circuit precedent found in Theofel, which required the use of a warrant to obtain such records from an ISP.

The Hon. Jeanne Scott of the Central District of Illinois pointed to the Stored Communications Act, 18 U.S.C. § 2701 et seq., and the Wiretap Act, 18 U.S.C. § 2510 et seq., to resolve the issue. The language leading to a warrant requirement in Theofel was found in § 2703(a), which stipulates that a governmental entity requiring a provider of electronic communication service to disclose electronic communications held in electronic storage for 180 days or less must obtain and present a warrant based upon probable cause. Subsection (b), on the other hand, allows the government to procure by subpoena similar emails less than 181 days old that are “held or maintained . . . solely for the purpose of providing storage or computer processing services to such subscriber or customer . . . .”

The question: in Weaver, were Defendant’s emails “in electronic storage” – subject to the warrant requirement – or were they “held or maintained . . . solely for the purpose of providing storage or computer processing services” (etc.) and thus available via subpoena?

As defined by the Wiretap Act, because the emails were opened, the only way they could have been in “electronic storage” would be if they were in storage “for purposes of backup protection of such communication . . . .” 18 U.S.C. § 2510(17)(B). This is where the facts in Weaver differ from those in Theofel. In Theofel the Ninth Circuit was addressing an email system in which users downloaded messages from their ISP to their local hard drive. With such systems, residual copies of an already-downloaded email remaining on the ISP’s server could be kept for backup protection until the user’s copy “expire[s] in the normal course.” Theofel, 359 F.3d at 1070. Yet today we’re seeing more and more use of web-based (cloud-based) email systems.

This was the situation the court addressed in Weaver. Here, the Defendant was using Microsoft’s cloud-based Hotmail email. The Ninth Circuit itself, in Theofel, pointed out that “[a] remote computing service might be the only place a user stores his messages; in that case, the messages are not stored for backup purposes.” Id.

So both courts appear to agree that web-based email falls under § 2703(b), meaning the government is free to compel production from an ISP via subpoena. Or, per the same subsection, the government may even compel production without notice if it wishes to secure a warrant. But going even further, the Weaver court faulted the Ninth Circuit’s analysis as “unpersuasive” and as out of step with “legislative history and other provisions of the Act.” In drafting the Stored Communications Act, the drafters noted the case in which an addressee receives an email yet chooses to leave it in storage on the ISP’s server for later re-access. The drafters said that “such communication should continue to be covered by section 2702(a)(2)” – a section that reads identically to the provision allowing the Government access by trial subpoena.

In light of statements that have come from the executive branch along with various other court decisions referring to a diminished “expectation of privacy” in the use of cloud-based computing, the lowered level of Due Process protection represented by this decision really isn’t such a surprise.


Logik — Sailing the safe harbor

Sailing the safe harbor Posted By Daniel Kaiser, Esq. on August 11, 2009

BRING OUT YOUR DEAD…documents.  If your company goes to court, and your opponent’s discovery request includes dead files or electronic files previously deleted from your archives, have you secured safe harbor protections against court sanctions?  In order to do so, here’s a quick set of guidelines:

  • Adopt a company-wide document retention policy, defining the time frames within which specific categories of documents must be retained (according to file type, local and federal law, and industry standards).
  • Eliminate files only after their defined retention period expires.
  • Consistently implement this policy, throughout your company.
  • In the event that you might reasonably anticipate litigation, implement a “legal hold” policy defining and executing the process by which relevant information is identified, preserved, and maintained for discovery purposes.[1]
  • Enjoy safe harbor protections for files deleted in the course of database management (as defined by your document retention policy), falling outside of the context of the aforementioned legal hold.
  • In this context, an adverse inference or other court sanction for spoliation of evidence would require the following three elements:
    1. that the party having control over the evidence had an obligation to preserve it at the time it was destroyed;
    2. that the records were destroyed with a ‘culpable state of mind’; and
    3. that the destroyed evidence was ‘relevant’ to the party’s claim or defense such that a reasonable trier of fact could find that it would support that claim or defense.[2]
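
The guidelines above can be sketched in a few lines of code.  This is only an illustration, not a compliance tool: the categories, the retention periods, and the names `RETENTION_DAYS`, `Document` and `may_delete` are all hypothetical, and real periods depend on the law and industry standards that apply to you.  The point it shows is the ordering of the checks: a legal hold always wins over the routine deletion schedule.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per category, in days (illustrative only).
RETENTION_DAYS = {
    "email": 365,
    "contract": 7 * 365,
    "tax_record": 7 * 365,
}


@dataclass
class Document:
    name: str
    category: str
    created: date
    custodian: str


def may_delete(doc, today, held_custodians):
    """Return True only if routine deletion is allowed for this document."""
    # A legal hold suspends routine deletion, whatever the document's age.
    if doc.custodian in held_custodians:
        return False
    # Categories the policy does not cover are retained by default.
    days = RETENTION_DAYS.get(doc.category)
    if days is None:
        return False
    # Delete only after the defined retention period has expired.
    return today > doc.created + timedelta(days=days)


today = date(2009, 8, 11)
old_email = Document("q1-update.msg", "email", date(2007, 1, 15), "alice")
print(may_delete(old_email, today, held_custodians=set()))      # True
print(may_delete(old_email, today, held_custodians={"alice"}))  # False
```

Checking the hold first and refusing to delete anything the policy does not cover are deliberate choices in this sketch: when in doubt, it retains.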

The corporate records you maintain as electronically stored information (ESI) – now including email, voice messages, proposals, sales documents, contracts, legal documents, tax records, employment records, Board minutes, and press releases amongst other important files – are both assets and potential burdens to your company.  Having extensive records at your fingertips will enable smooth operations by informing you in your transactions with existing and potential clients, by allowing market analysis and company forecasts, and potentially by protecting you in the event of a lawsuit.

Yet every coin has its flip-side.  In many companies, massive proliferation of ESI threatens to bog down their storage capacities.  Picture the deluge in physical terms: if one gigabyte of data would (roughly speaking) require enough pages of print-out to fill a pick-up truck to capacity, imagine the one thousand twenty-four trucks that would be required to hold a terabyte.  Such volumes are often seen in today’s large enterprises.

Understandably, in this post-Enron and Sarbanes-Oxley era, when in doubt regarding the decision to retain or delete a document, many have chosen to avoid potential liability by opting to retain.  Section 404 of the Sarbanes-Oxley Act of 2002 ties executive liability to, among other things, the presence of effective internal controls.[3]  A reasonable and functioning document retention policy could be a relevant metric in the examination of these controls.  Even more directly, the errant destruction of an electronic file may lead to the inability to produce a document requested during the discovery phase of a lawsuit.  Rule 37 of the Federal Rules of Civil Procedure (Fed. R. Civ. P.) levies sanctions upon parties to a dispute (including an unfavorable default judgment) for a failure to make disclosures or to cooperate in discovery.[4]

This Scylla and Charybdis, the opposing hazards of run-away databases on the one hand and over-zealous ESI culling on the other hand, can be fairly easily avoided.  Fed. R. Civ. P. 37(e) creates the following Safe Harbor for “Failure to provide Electronically Stored Information”:

Absent exceptional circumstances, a court may not impose sanctions under these rules on a party for failing to provide electronically stored information lost as a result of the routine, good-faith operation of an electronic information system.[5]

It is vital to note from the outset that this Rule does not open the doors to old-style record elimination – such an interpretation would simply amount to federal common law spoliation of evidence[6] – yet a Safe Harbor may be sailed via the well-crafted creation of and compliance with a document retention policy.  Two questions that should be asked are: 1) How do I create a Safe Harbor for my company’s elimination of ESI?  2) How do I ensure that my company doesn’t sail beyond the boundaries of the Safe Harbor?

Creating your Safe Harbor begins with the drafting and implementation of your document retention policy.  Your company’s document retention policy (including your digital document retention protocol) should be your best friend.  By creating guidelines regarding the period of time during which different categories of documents should be maintained you will be able to free your hard drives from their accumulation of digital detritus, and at the same time you can ensure that your employees do not eliminate files relevant to a potential litigation (a.k.a. covering your back-side).  In setting the time frames within which a particular file of ESI should be maintained, consideration must be given to local and federal laws,[7] to the standards of the industry within which your company fits, and to the type of ESI in question (see the types of corporate records discussed at the beginning of this article).

Next, a company must take care to put a “legal hold” on their routine when the shadow of litigation appears.  Even if a file is scheduled for routine culling, the act of eliminating a piece of relevant ESI in such circumstances would remove the Safe Harbor protections of Fed. R. Civ. P. 37(e), exposing you fully to Rule 37’s sanctions (including the default judgment).  The Sedona Conference Working Group on Electronic Document Retention & Production has produced an excellent commentary entitled The Trigger & The Process.[8]  This commentary clarifies the circumstances in which a legal hold should be placed on ESI, providing eleven useful guidelines expanding upon the following statement:

The law has developed rules regarding the manner in which information is to be treated in connection with litigation.  One of the principal rules is that whenever litigation [or a regulatory investigation or proceeding] is reasonably anticipated, threatened or pending against an organization [or natural person], that organization has a duty to preserve relevant information.  This duty arises at the point in time when litigation is reasonably anticipated whether the organization is the initiator or the target of litigation.[9]

So far, so good – but a policy per se will not be sufficient.  Any compliance team will be quick to point out that the human element can be one of the trickier elements to tackle when implementing a new policy.  As basic as it sounds, a strong dose of oversight is required to ensure that your policy is, in fact, executed.  In the Corporate Counsel Section of the New York State Bar Association annual meeting, panelist Kenneth Rashbaum (of Fios, Inc.) pointed out that the “most critical aspect of record retention policies and e-mails is employee education . . . employees won’t follow what they don’t understand. . . .”[10]  In addition to explaining what records fall into particular categories of your policy, another panelist (Eva L Jerome of Bryan Cave LLP) pointed out that upon implementation of a legal hold, “oral instruction immediately followed by a written one . . .” should be given to all potential data custodians, followed by “ongoing monitoring of compliance, including sending out periodic reminders of the hold and recertifications. . .”[11]

If you need to create, implement, or oversee a digital document retention protocol, discuss these issues with your attorney.  Ensure compliance.  Discover and sail the sheltered waters of your safe harbor.  Put a legal hold on your ESI if the shadow of litigation presents.

[1] www.thesedonaconference.org/content/miscFiles/Legal_holds.pdf at 2.
[2] Zubulake v. UBS Warburg LLC, 229 F.R.D. 422, 430 (S.D.N.Y. 2004).
[3] Sarbanes-Oxley Act, Pub. L. No. 107-204, § 404, 116 Stat. 745 (2002).
[4] Fed. R. Civ. P. 37.
[5] Id. at (e).
[6] See, e.g., Silvestri v. General Motors, 271 F.3d 583 (4th Cir. 2001).
[7] See, e.g., Fair Labor Standards Act of 1938.
[8] www.thesedonaconference.org/content/miscFiles/Legal_holds.pdf
[9] Id. at 1.
[10] Alessandra Scalise, Corporate Counsel program reviews e-record management, N.Y. St. B.A. State Bar News, Mar.-Apr. 2009, at 24.
[11] Id. at 27.  See also Zubulake v. UBS Warburg LLC, 229 F.R.D. 422, 432 (S.D.N.Y. 2004).

 


Logik — Electronic Document Management Systems in 2009

Electronic Document Management Systems in 2009 Posted By Daniel Kaiser, Esq. on August 10, 2009

AIIM’S Revised Recommended Practices

Get ready for an acronym or two. Oh what the heck, make it seven. No Glossary Needed (NGN).

In June the non-profit Association for Information and Image Management (AIIM), an official ANSI-approved Standards Development Organization, approved and released the updated 2009 version of AIIM ARP-1-2009: Recommended Practice – Analysis, Selection, and Implementation of Electronic Document Management Systems (EDMS).

AIIM focuses on the tools and modes used in Enterprise Content Management (ECM – or enterprise-level data management) standards. This vendor-neutral report was prepared by industry experts and approved under a Standards Board including members of the U.S. District Courts, Microsoft, Adobe Systems Inc., and OpenText. The new practice guidelines address the analysis, selection and implementation procedures associated with electronic document management. Starting with a description of the technologies currently being used by companies to store and manage ESI, the report details current industry standards and finishes up with a review of industry best practices.

In their June 16 report, AIIM quoted former general counsel and securities regulator Virginia Jo Dunlap in stating that “[c]ompanies that will be facing any type of E-Discovery requests should pay close attention to ARP-1 as it provides guidance on the critical first steps toward being able to certify to courts or regulators that the documents produced are accurate.”

A complete copy of this report can be downloaded here.


Logik — Passing the bucks – when and why to expect cost-shifting

Passing the bucks – when and why to expect cost-shifting Posted By Daniel Kaiser, Esq. on August 10, 2009

How do you feel about “going Dutch?”  You may or may not have strong feelings about being asked to split a dinner tab, but my money says that you’ll have even stronger feelings about splitting a discovery “tab.”  This is a brief look at when to expect cost-shifting in eDiscovery.

When?

From the outset, keep in mind that eDiscovery cost-shifting is an extraordinary remedy. Court modifications of discovery requests (including cost-shifting) are not a given.  In fact, the benchmark decision of Zubulake I points out that in many typical discovery requests a consideration of cost-shifting would be wholly inappropriate.[1]  In general, courts should deny burdensome requests for data in the absence of a reasonable prospect that the data will contribute significantly to discovery.[2]

The remedy is more likely to arise where a request might be burdensome upon the recipient, but that burden is coupled with a justification – a demonstration of substantial need by the requesting party.[3]  While a motion to limit a discovery request or to shift a portion of discovery costs to the requesting party remains a matter of court discretion, clear guidance has been provided in several compelling sources – allowing us the benefit of a few reasonable predictions.

Typically expect cost-shifting when…

  • a party is compelled to recover and produce deleted data – deleted as a result of the routine, good-faith maintenance of an electronic information system;[4]
  • a party is compelled to recover and produce data from recovery / backup tapes;[5]
  • a party is compelled to recover and produce residual data;[6]
  • a party is compelled to recover and produce legacy data;[7]
  • the aggregate volume of data requested outstrips the needs of the requesting party;[8]
  • the requesting party has disproportionately greater resources than the party from whom the data is sought;[9] or
  • there is a lack of reasonable likelihood that the requested evidence will lead to the discovery of admissible evidence.[10]


Don’t expect cost-shifting when…

  • a party could reasonably have anticipated litigation yet failed to place a legal hold on relevant data, allowing that data to be deleted;[11]
  • in spite of the fact that the production of certain data would be unduly burdensome, a party agreed to a stipulation ordering production of the data in question;[12] or
  • the data requested is reasonably accessible, meaning compliance would not be unduly burdensome or costly.[13]

What?

Where do these factors come from?  What does “reasonably accessible” mean?  In a federal context, eDiscovery requests are at the discretion of the court.  Fed. R. Civ. P. 26 notes that where the production of ESI is found to be unduly burdensome (where the ESI is not reasonably accessible) the court may “specify conditions for the discovery.”[14]  So how do we recognize undue burden or cost, the lodestar for data that is not reasonably accessible?  The Federal Rules of Civil Procedure, The Sedona Principles, and case law all shed light on these questions.

Reasonable Availability and Undue Burden in Context…

  • Rule 26 provides a proportionality standard to be used when a court steps in to specify discovery conditions, incorporating the factors of IT feasibility, balancing its burden or expense against “its likely benefit, considering the needs of the case, the amount in controversy, the parties’ resources, the importance of the issues at stake . . . and the importance of the discovery in resolving the issues.”[15]
  • Principle 13 of The Sedona Conference Working Group on Electronic Document Retention & Production states that “if the information sought is not reasonably available . . . in the ordinary course of business, then, absent special circumstances, the costs of retrieving and reviewing such electronic information may be shared by or shifted to the requesting party.”[16]  The bald use of “reasonably available” and “ordinary course of business” may be vague on its own, but Comment 13.a. provides eight factors to determine whether cost-shifting should occur in the production of burdensome ESI:
    1. whether the information is reasonably accessible as a technical matter without undue burden or cost;
    2. the extent to which the request is specifically tailored to discover relevant information;
    3. the availability of such information from other sources, including testimony, requests for admission, interrogatories, and other discovery responses;
    4. the total cost of production, compared to the amount in controversy;
    5. the total cost of production, compared to the resources available to each party;
    6. the relative ability of each party to control costs and its incentive to do so;
    7. the importance of the issues at stake in the litigation; and
    8. the relative benefits to the parties of obtaining the information.[17]
  • In Zubulake I, Judge Scheindlin wrote that “whether production of documents is unduly burdensome or expensive turns primarily on whether it is kept in an accessible or inaccessible format,”[18] and went on to clarify that metric with the following seven-point test:
    1. The extent to which the request is specifically tailored to discover relevant information;
    2. The availability of such information from other sources;
    3. The total cost of production, compared to the amount in controversy;
    4. The total cost of production, compared to the resources available to each party;
    5. The relative ability of each party to control costs and its incentive to do so;
    6. The importance of the issues at stake in the litigation; and
    7. The relative benefits to the parties of obtaining the information.[19]

The first two factors are generally the weightiest, but factor six takes precedence if the case is one of broad, important impact.[20]  This calculus is objective; a sampling of the requested data is required to allow an analysis of these factors.
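
For the checklist-minded, the weighting described above can be written down as a small data structure.  This is a sketch for organizing a review, not a legal test: the factor labels paraphrase the list above, the function `review_order` and its `broad_public_impact` flag are hypothetical names, and the code assumes the remaining factors are simply taken in their listed order.  It deliberately assigns no numeric weights, since the source gives none.

```python
# The seven Zubulake I factors, paraphrased, in the order listed above.
ZUBULAKE_FACTORS = [
    "request specifically tailored to relevant information",
    "availability of the information from other sources",
    "total cost of production vs. amount in controversy",
    "total cost of production vs. each party's resources",
    "relative ability and incentive to control costs",
    "importance of the issues at stake in the litigation",
    "relative benefits to the parties of obtaining the information",
]


def review_order(broad_public_impact=False):
    """Return factor numbers (1-7) in the order a reviewer might weigh them."""
    order = list(range(1, len(ZUBULAKE_FACTORS) + 1))
    if broad_public_impact:
        # Factor six takes precedence in cases of broad, important impact.
        order.remove(6)
        order.insert(0, 6)
    return order


for number in review_order(broad_public_impact=True):
    print(number, ZUBULAKE_FACTORS[number - 1])
```

Called with the default argument, the function returns the factors in their listed order, with the first two (the weightiest) at the front.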

What you should do…

Review these factors (and any corresponding state/local law) to see whether an eDiscovery request is likely to fall within a precedent for cost-shifting.  It remains vital that in the shadow of anticipated litigation, you maintain viable records of your relevant data.[21]  Be prepared, and take the mystery out of “going Dutch.” 

[1] Zubulake v. UBS Warburg LLC, 217 F.R.D. 309 (S.D.N.Y. 2003) (drawing a distinction between accessible email files on optical discs and less accessible email files on backup tapes).
[2] The Sedona Conference Working Group on Electronic Document Retention & Production (WG1), The Sedona Principles: Second Edition, Best Practices Recommendations & Principles for Addressing Electronic Document Production, Comment 13.b. (June 2007).
[3] Id.
[4] Id. at Comment 13.a.
[5] Id.
[6] Id.
[7] Id.
[8] Id.  See Fed. R. Civ. P. 26(b)(2)(C)(iii).
[9] Id.  See Fed. R. Civ. P. 26(b)(2)(C)(iii).
[10] See Fed. R. Civ. P. 26(b)(2)(C), Fed. R. Civ. P. 26(c), see also Zubulake v. UBS Warburg LLC, 217 F.R.D. 309 (S.D.N.Y. 2003).
[11] See http://www.thesedonaconference.org/content/miscFiles/Legal_holds.pdf.  See also Procter & Gamble Co. v. Haugen, 2003 WL 22080734, No. 1:95CV94 DAK (D. Utah August 19, 2003).
[12] In re Fannie Mae Securities Litigation, 552 F.3d 814 (D.C. Cir. 2009).
[13] Fed. R. Civ. P. 26(b)(2)(B).
[14] Fed. R. Civ. P. 26(b)(2)(B).
[15] Fed. R. Civ. P. 26(b)(2)(C).
[16] The Sedona Conference Working Group on Electronic Document Retention & Production (WG1), The Sedona Principles: Second Edition, Best Practices Recommendations & Principles for Addressing Electronic Document Production, at 67 (June 2007).
[17] Id. at Comment 13.a.
[18] Zubulake v. UBS Warburg LLC, 217 F.R.D. 309 (S.D.N.Y. 2003).
[19] Id. at 322.
[20] Id.
[21] See FN 11 supra, and accompanying text.


Logik — Reading the pulse of the industry

Reading the pulse of the industry Posted By Daniel Kaiser, Esq. on August 5, 2009

Open wide and say “ahh.” Um-hmm… interesting.

These days we seem to be surrounded by various pronouncements and diagnostics on the health of the economy. Sometimes these seem to be counter-intuitive. Consecutive months of increased spending (rising 0.5%) at the same time as a 1.3% fall in personal income? (Consumer spending rose again in June.) More people are filing first-time claims for unemployment benefits, but the trend is improving? Somehow I think the idea that “the pace of decline [has] moderated” can cut both ways.

This climate has certainly had an effect in the realm of law, litigation and legal services. Newspapers from China to the UK are running stories on the New York information technology graduate who is suing to recover her college tuition after finding herself unemployed. Within the industry itself, we’ve heard plenty of news and advice regarding law firm dissolution and downsizing, layoffs, and associate/staff furloughs. This news often seems to find its way around through rumors and speculation, but a fair amount of advice comes from the ABA itself.

In the world of eDiscovery, the truth is that for those who are willing to keep pace with the cutting edge of technology and the rapid evolution of the law, the opportunities to succeed are not-so-hidden. George Socha and Tom Gelbmann (of the Socha-Gelbmann Electronic Discovery Survey) have pointed out the growth and strength of new, creative and innovative Electronic Data Discovery (EDD) providers with “strong, scalable, sophisticated advanced search tools.” The industry is growing along with them. Survey participants expect the eDiscovery market to expand by “about 30% throughout 2009 and about 25% in 2010.” The continuing increase in the volume of data processed by eDiscovery providers would seem to substantiate these expectations.

Gifted programmers, project managers, and attorneys have been steering their careers in this direction for some time now, finding plenty of scope for career development. Law firms and corporations have also been eager to hire experienced EDD professionals, yet for all the demand and need, the workers seem to be few. Socha and Gelbmann’s survey participants perceive there to be “no more than 100 to 200 lawyers in the entire country [who] really ‘get’ EDD.”

In this industry, opportunities are no longer reserved for seniority. Court decisions are beginning to spring up around the country, to the accompaniment of rewritten portions of the Federal Rules of Civil Procedure. In contrast to the more glacial pace at which most areas of the law develop, the law, substance and methods of eDiscovery have been on a growth spurt – struggling (and in some cases failing) to keep up with an even more rapid development in electronic communications. In fact, a sobering picture of the court’s reaction when an attorney hasn’t kept pace with the latest developments can be seen clearly in Chen v. Dougherty, 2009 WL 1938961 (W.D. Wash. July 7, 2009). Referring to an experienced litigator’s failure to submit search terms, the court said that her “inhibited ability to participate meaningfully in electronic discovery tells the Court that she has novice skills in this area and cannot command the rate of experienced counsel.”

This application of emerging law to an ever-developing sea of potential discovery sources (from social networking engines to cloud-based platforms for applications and data storage) has placed the digitally-native generation at a double advantage: 1) Comfort with the technology through immersion in the amorphous world of electronic social networking, applications and electronic search procedures; and, 2) A rare opportunity to begin a career on something close to level ground (that is, more level than a junior associate’s usual starting position) when it comes to knowing the rules of the game.


Logik — Sarek, Spock’s Dad

Sarek, Spock’s Dad Posted By Andy Wilson on August 1, 2009

Logi(k) offers a serenity humans seldom experience in full.


Logik — How to hook it up right

How to hook it up right Posted By Adam Reilly on July 31, 2009

Prevent data spoliation by using a simple write-blocking device. They are fairly cheap (~$270 @ tableau.com) and well worth the price, considering that spoiling data may just ruin your whole day.

Connecting a hard drive to a computer seems simple enough.  But if you want to avoid modifying the metadata on the drive you will need to use a write-blocking device that prevents the hard drive from updating the metadata.  This is very important, especially for legal discovery where metadata should always be preserved to avoid spoliation.
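
Nothing in software replaces a hardware write blocker, but a simple check can help show that a file was not altered while you handled it.  The sketch below is an addition to the tip above, not part of any vendor's tooling: the helper names `snapshot` and `unchanged` are hypothetical.  It records a SHA-256 hash, the size, and the modification time of a file so the same values can be compared later.

```python
import hashlib
import os


def snapshot(path, chunk_size=1 << 20):
    """Record a file's content hash plus its size and modification time."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:  # opened read-only, in binary mode
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    info = os.stat(path)
    return {
        "sha256": digest.hexdigest(),
        "size": info.st_size,
        "mtime_ns": info.st_mtime_ns,
    }


def unchanged(before, after):
    """True if content hash, size, and modification time all still match."""
    return before == after
```

Take one snapshot when the drive is first connected and another when you are done; if `unchanged(before, after)` is false, something touched the file.  The access time is left out on purpose, because on some systems merely reading a file updates it, which is exactly the kind of change a write blocker is there to prevent.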

 


Logik — Logik is such a TREC-ee

Logik is such a TREC-ee Posted By Andy Wilson on July 30, 2009

Not the Star Trek kind of Trekky, well, maybe considering the high likelihood most participants (all tech companies) have seen all 7 movies.  No offense, live long and prosper TREC participants.  This TREC is more about going where no text retrieval algorithm has gone before, and less about finding new planets, although you could make an argument for it…but I digress.  TREC stands for the “Text REtrieval Conference” and is co-sponsored by the National Institute of Standards and Technology (NIST) and U.S. Department of Defense.  We are very proud to be a participant in their 2009 TREC study.  So, what does that mean exactly?
TREC gave us a very LARGE data set (the Enron emails) and a subpoena, and said figure it out.  Easy enough, right?  Maybe for The Enterprise crew this is easy, but for most eDiscovery companies, including us, this is a major challenge and one that has significant meaning.  Our job will be to use what we know about search within discovery and find all the relevant emails and attachments that relate to the subpoena.  This requires more than just your standard set of boolean keyword searches.  We will need to use more powerful text retrieval algorithms to find the needles in the haystack. 
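
To make the difference concrete, here is a minimal sketch comparing a boolean AND keyword search with a simple ranked search.  It is not the method we are using for TREC: TF-IDF is swapped in as a textbook stand-in for "more powerful text retrieval algorithms," the three messages are invented for illustration, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_and(docs, terms):
    """Ids of documents that contain every query term."""
    wanted = set(terms)
    return [doc_id for doc_id, text in docs.items()
            if wanted <= set(tokenize(text))]


def tfidf_rank(docs, terms):
    """Rank documents by summed TF-IDF of the query terms, best first."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    scores = {}
    for doc_id, words in tokens.items():
        counts = Counter(words)
        score = 0.0
        for term in terms:
            # Document frequency: how many documents mention the term.
            df = sum(1 for other in tokens.values() if term in other)
            if df and counts[term]:
                idf = math.log(n_docs / df) + 1.0
                score += (counts[term] / len(words)) * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


# Invented sample messages, for illustration only.
docs = {
    "m1": "Please shred the audit memo before the review",
    "m2": "Lunch on Friday?",
    "m3": "Audit schedule attached, audit team arrives Monday",
}
query = ["audit", "shred"]
print(boolean_and(docs, query))  # ['m1']
print(tfidf_rank(docs, query))   # ['m1', 'm3']
```

The boolean search returns only the message that contains both words.  The ranked search also surfaces the second audit message, lower down the list, which a reviewer may still want to see.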

None of the participants are allowed to post their results, even if they find every single document relevant to the subpoena.  Although it’s probably every marketer’s dream to post the results (assuming they are good), TREC is smart to not allow it.  Each participant is required to submit their results, the tools they used, etc. to TREC by September 7th, 2009.  So, the clock is ticking.  Hopefully, more advanced and accurate methods for text retrieval will come out of this process.  If only the good people at NIST offered up a Netflix-like $1m prize (http://logiik.com/L)...sigh.  Wish us luck.

Here is a brief intro into TREC taken from their website: http://trec.nist.gov/

The Text REtrieval Conference (TREC), co-sponsored by the National Institute of Standards and Technology (NIST) and U.S. Department of Defense, was started in 1992 as part of the TIPSTER Text program. Its purpose was to support research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies. In particular, the TREC workshop series has the following goals:

  • to encourage research in information retrieval based on large test collections;
  • to increase communication among industry, academia, and government by creating an open forum for the exchange of research ideas;
  • to speed the transfer of technology from research labs into commercial products by demonstrating substantial improvements in retrieval methodologies on real-world problems; and
  • to increase the availability of appropriate evaluation techniques for use by industry and academia, including development of new evaluation techniques more applicable to current systems.

TREC is overseen by a program committee consisting of representatives from government, industry, and academia. For each TREC, NIST provides a test set of documents and questions. Participants run their own retrieval systems on the data, and return to NIST a list of the retrieved top-ranked documents. NIST pools the individual results, judges the retrieved documents for correctness, and evaluates the results. The TREC cycle ends with a workshop that is a forum for participants to share their experiences.

This evaluation effort has grown in both the number of participating systems and the number of tasks each year. Ninety-three groups representing 22 countries participated in TREC 2003. The TREC test collections and evaluation software are available to the retrieval research community at large, so organizations can evaluate their own retrieval systems at any time. TREC has successfully met its dual goals of improving the state-of-the-art in information retrieval and of facilitating technology transfer. Retrieval system effectiveness approximately doubled in the first six years of TREC.

TREC has also sponsored the first large-scale evaluations of the retrieval of non-English (Spanish and Chinese) documents, retrieval of recordings of speech, and retrieval across multiple languages. TREC has also introduced evaluations for open-domain question answering and content-based retrieval of digital video. The TREC test collections are large enough so that they realistically model operational settings. Most of today’s commercial search engines include technology first developed in TREC.

http://trec.nist.gov/


Logik — Electronic gadget blackout

Electronic gadget blackout Posted By Daniel Kaiser, Esq. on July 29, 2009

Planning to use your electronic exhibits in court? On June 29, 2009, the United States District Court for the Southern District of New York announced an interim measure that denied attorneys permission to bring their laptop computers (in addition to other electronic devices) through security and into the Courthouse short of a specific court order “authorizing a specific attorney to bring a specific electronic device into the building for a specific proceeding.”

This draconian little measure stemmed from the local judges’ concerns that laptops can contain bombs, and that personal electronic devices can make prohibited recordings during a proceeding. The new procedure came from the Southern District’s Ad Hoc Committee – superimposed upon the Court’s Local Civil Rule 1.8, reading “[n]o one other than court officials engaged in the conduct of court business [or federal prosecutors and defenders] shall bring any camera, transmitter, receiver, portable telephone or recording device into any courthouse or its environs without written permission of a judge of that court.” The big difference between the old rule and the new is that under Local Rule 1.8 it was common for judges to sign blanket orders allowing an array of unspecified electronic devices, but the interim rule requires a whole lot of specificity concerning the who, the what, and the when. Simply put, this means a good deal fewer tools in the courthouse.

Although this new policy affected only the US District Court for New York’s Southern District, it’s the big one. This is the largest of all US District Courts with up to 1/5 of all civil litigation pending before the Federal Courts. You can read about its history here. The oft-quoted Judge Learned Hand presided in this court from 1909 to 1924, and might once again be quoted in this context: “Life is made up of constant calls to action, and we seldom have time for more than hastily contrived answers.” (Speech in New York City, 1952.)

While this measure is in effect, greater advance planning is a must for an attorney wishing to display electronic evidence exhibits or other electronic exhibits – or simply to consult a calendar or case notes. On a positive note, it’s likely that this interim measure will never make it into the permanent Rules. The Federal Bar Council (among others) is submitting proposed amendments to Local Civil Rule 1.8 relaxing the standard to allow attorneys the use of their electronic devices, but subject to strict use conditions and perhaps subject to the use of a Secure Pass ID. The Southern District’s Ad Hoc Committee on Cell Phones met to receive public comment on July 29.

Logik — Logik.com is launched!

Logik.com is launched! Posted By Andy Wilson on July 28, 2009

After six months of development, we are finally launching our new site, which you happen to be on now.  Mind blowing, I know.  Here is the formal press release we issued about the site:

NEW LOGIK WEBSITE ENLIGHTENS, EDUCATES AND ENTERTAINS
Enhanced resources on Logik.com include a rich library of eDiscovery articles, how-to videos, informative blog posts, and a detailed look into Logik’s service offerings and transparent pricing model.

For Immediate Release

WASHINGTON DC, July 28, 2009 - A new online destination has arrived for those seeking to learn more about Logik and related eDiscovery information with the redesigned Logik.com website. As an eDiscovery processing company, Logik.com’s goal is simple: get straight to the point on what they do, how they do it and why their clients love to work with them. Sounds like a pretty simple concept, but it’s one that Logik worked hard to implement and get right. What sets the Logik.com website apart from other eDiscovery company websites is how it makes information easy to find. The site is designed with customers (past, present and future), the media and job-seekers in mind: it’s graphically appealing and easy to navigate. Logik.com features insightful articles on the industry, instructional “how-to” and best-practice videos, technology updates, and eye-candy to please all visitors. And it features Logik’s new mascot for eDiscovery processing services: Logikbot.

“When we started to strategize about the new design for Logik.com, we knew we wanted a site that matched our focused work style and relaxed personalities. We aren’t like everyone else, and our new website clearly shows that,” said Andy Wilson, co-founder and CEO at Logik. “As our customers know, we maintain our sense of humor while working exceedingly hard for them, and we wanted our marketing to represent us in a full light. We chose vibrant imagery on the site and decided to showcase our new mascot, the Logikbot.” 

The site also has an interesting, original promotion for an eDiscovery company: visitors can register to win a free case of Logik “Redaction” red wine.  It’s not often you see a technology company making their own wine and then giving it away through an online raffle. “Andy and I are big fans of wine,” said Sheng Yang, Logik co-founder and CTO. “With the help of Crushpad, a wine-making business in San Francisco, we created our own wine for our employees and valued clients. We’ve already sampled it, and I have to say, we think people are going to really enjoy the rich and spicy flavor. Redaction will be ready to drink by early 2010, so we encourage our visitors to register to win a case and taste the fruits of our labor on Logik.com.”

Front and center on the new Logik.com website is a section entirely dedicated to Logik’s transparent eDiscovery pricing model, offering a detailed description of the model and what clients can expect to get for their money.  “Most pricing models in the eDiscovery processing business are about as consistent and easy to understand as the congressional budget,” said Andy. “Not knowing what you are going to pay until you get the bill is extremely annoying, and it makes it hard to plan your annual budget. This is why we’ve designed our innovative eDiscovery technology to allow for 100% predictable processing costs. This means our clients always know how much they are going to spend for eDiscovery processing services before they send us the data. It sounds like a no-brainer, but it is a very rare pricing model in our industry and one our clients love.”

“We are a relatively young technology company—just five years—and we have accumulated a lot of relevant industry knowledge over the years,” said Andy. “We designed Logik.com to enable us to easily share our experiences. We hope people will learn some useful things on the website while getting a kick out of our take on eDiscovery marketing. And we think that’s a great combination.”

Discover the newly designed website at http://www.logik.com.

Logik.com is designed with the best Web standards in mind and should appeal to Logik’s client base as well as the general public. To make sure you have the best experience possible, please ensure you have JavaScript enabled in your browser.

About Logik:

Logik (formerly known as Logik Systems) is an eDiscovery processing company located in Washington, DC. Logik helps corporations, law firms, government agencies and service providers simplify electronic data sought in discovery requests. Logik’s highly distributed processing platform, Gridlogik, was developed to process all kinds of unstructured and structured data sets such as email databases, spreadsheets, images and MS Office documents. Combined with their transparent pricing model, Logik offers customers the smart way to discover accurate results and make sense of processing costs. Find out more at logik.com.

Media interested in setting up an interview with a representative from Logik should contact Logik by email or call (800) 951-5507.

###

Logik — Japanese Data Tsunami

Japanese Data Tsunami Posted By Adam Reilly on July 24, 2009

Just the facts please

  • Four terabytes of Japanese data
  • English and Japanese search terms
  • 14,000,000 pages for review
  • 8,000,000 pages produced to ITC
  • $500,000 in savings
  • < 2 months to complete
  • Happy client, happy customer

Challenge:

One of the world’s largest producers of wind turbines needed to collect, process, analyze, review and produce over four terabytes of “real” data (email and office files) to the ITC in a matter of months. What they had was a wave of data full of different encodings, email formats (Lotus, Outlook, EML, text-based), and Japanese proprietary document formats. They clearly needed help. Our client, one of the top three IP law firms on the planet, was tasked with managing this complex process from beginning to end. The data was collected in Japan and the US from over 100 people. Due to the volume of data, keywords in both English and Japanese (multiple encodings) were approved and needed to be applied to the large data set, post processing of course—a huge effort that needed help. Our client came to Logik to get the work done quickly and accurately.

So, what’s the problem?

  • Four terabytes of emails and office files = tens of millions of documents pre-search
  • Japanese documents have multiple encodings, making search and detection extremely difficult – plus the words need to be “tokenized” for accurate search
  • ITC has tight deadlines and expects perfect productions without error
  • Choosing a vendor that uses “extracted size” billing would double or triple the cost
  • So…which documents are English and which are Japanese, Chinese, or Korean again?

What we did:

Great project management is needed for a project of this size and scope. The first thing we did was assemble a team to work directly with our law firm client and upper management from the customer to devise a realistic schedule. Normally, four terabytes of data trickles in as it is collected over time – here, we were able to get all of the data delivered within a month’s time. The schedule we created allowed us to provide massive rolling deliveries of data (hundreds of thousands of documents), meaning the client was never without documents to review (always a good thing).

The results:

  • Using language detection, we were able to flag all non-English documents with their respective language (e.g. Japanese, Chinese, Korean, etc.), thus facilitating a more efficient document review
  • We delivered ~14,000,000 pages, post search, in native + TIFF format to our client for review
  • Over 8,000,000 pages were flagged as responsive, numbered, endorsed and provided to the ITC
  • Production of the 8,000,000 pages took less than 24 hours for us to complete, ready to be delivered to the ITC
  • All data was processed, searched and delivered in under 2 months on a rolling delivery schedule, easily making the tight ITC deadline
  • Against other bids for this project, we saved the client over $500,000 in processing fees
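
For a feel of the language flagging mentioned in the results above, here is a minimal Python sketch. It is not how Gridlogik works (we are not describing its internals here); it simply assumes that Unicode script ranges are good enough for a first pass, and the function name flag_language is made up for illustration. It expects text that has already been decoded, so the multiple-encodings problem has to be solved first, and Japanese written only in kanji would be mislabeled as Chinese.

```python
def flag_language(text):
    """Rough first-pass language flag based on Unicode script ranges.

    Illustrative only: real language detection uses statistical models.
    """
    has_kana = has_hangul = has_han = False
    for ch in text:
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:       # hiragana and katakana blocks
            has_kana = True
        elif 0xAC00 <= cp <= 0xD7A3:     # hangul syllables
            has_hangul = True
        elif 0x4E00 <= cp <= 0x9FFF:     # CJK unified ideographs
            has_han = True
    if has_kana:
        return "Japanese"
    if has_hangul:
        return "Korean"
    if has_han:
        return "Chinese"
    return "non-CJK"


if __name__ == "__main__":
    for sample in ["これはテストです", "한국어 문서", "中文文件", "Plain English memo"]:
        print(sample, "->", flag_language(sample))
```

Kana is checked first because Japanese text normally mixes kana with ideographs, while Chinese text contains no kana at all.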

Logik — Redaction Terms

Redaction Terms Posted By Adam Reilly on July 24, 2009

By participating in the Logik Redaction giveaway, entrants agree that Logik and their designees and assignees and all of their respective officers, directors, employees, representatives and agents shall have no liability and entrant will indemnify, defend and hold Logik harmless from any liability, loss, injury, or damage to entrants themselves or any other person or entity, including personal injury, death or damage to personal or real property, to entrant or any other person or entity due in whole or in part, directly or indirectly, by reason of the acceptance, possession, use or misuse of the prize or participation in this Giveaway (wow, that’s a long sentence).  Entrants further acknowledge that said parties have neither made nor are in any manner responsible or liable for any warranty, representation or guarantee express or implied, in fact or in law, relative to the prize, including, but not limited to, its quality or fitness for a particular purpose.  Logik is based in the US, so entrants must be 21 years of age or older, sorry kids.  Entering the giveaway multiple times does not improve your chances, sorry winos. Entrants must submit a valid email address that associates them with the eDiscovery industry.

Logik — Redaction

Redaction Posted By Adam Reilly on July 24, 2009

What is Logik Redaction?

Logik Redaction is a California red zinfandel created by Andy and Sheng for both our employees and our valued clients. Back in 2007 we decided that it would be a fun idea to create our own wine, so we hooked up with the amazing people at Crushpad in San Francisco and reserved a French Oak barrel to make our wine in.

Naturally, we named it Logik Redaction:

  • Red = the sweet color of the wine (and in our logo)
  • + Action = like Gridlogik™, the wine is designed to taste great and pack a punch (14.5% alcohol)
  • = Redaction = A common term in the eDiscovery industry (to revise or edit)

Crushpad has a huge wine-making facility right on the bay in downtown San Fran. They provided the tools and expertise, and we gave them the instructions on what we wanted to make: a really tasty, fruity and fun red Zin. The people at Crushpad get everyone involved; you can come down to help with the crushing, sipping, and bottling.  It’s a lot of fun.  If you are interested in making your own barrel, please let them know Logik sent you (we get a free case of wine for all referrals!).

Logik Redaction Giveaway

We’ve been known to enjoy a nice glass of wine at the end of a long day. Actually, we like red wine so much, we decided to make our own! We want to share the goodness with you, and are giving away one bottle of our coveted and custom-made red Zinfandel every month.

Enter to win a bottle of Logik Redaction, our very own red Zin.

Logik — Now that’s Logik tuff!

Now that’s Logik tuff! Posted By Andy Wilson on July 23, 2009

It’s a bird, a plane, no…it’s a Pelican case?  Have you ever wondered what would happen to a hard drive if you threw it off a four-story building?  Odds are the hard drive would smash into a million little pieces that even the best forensic examiner couldn’t piece back together.  BUT, what if you put that hard drive inside a plastic box, surrounded by impact foam?

Well, we wanted to find out if our Pelican cases were tuff enough to withstand the impact.  We placed one of our Logik hard drives within one of our custom designed Pelican cases and started the launch sequence, 5, 4, 3, 2,...1 launch!

The team on the ground secured the dropped case and hooked up the drive.  Sure enough, the case and hard drive were intact and still working, no shattered little pieces.  So, what does all this prove exactly?  We aren’t entirely sure, but one thing is certain, if you happen to drop one of our cases from your corner office the hard drive “should” be safe.

Be careful out there and watch out for falling Logik Pelican cases.

PS: We do eDiscovery better than we do film-making. But then, that’s why we do eDiscovery.

Logik — Beer for eDiscovery

Beer for eDiscovery Posted By Adam Reilly on July 23, 2009

Just the facts please

  • 500GB of PSTs
  • 70% cull rate
  • 900,000 documents reviewed
  • 1,000,000 pages produced
  • Less than 2 weeks to process
  • Less than 1 week to produce
  • Happy, happy client

Challenge:

Beer and eDiscovery go together like hops and barley.  Our law firm client had a large, very well known beverage company as a customer who was in the middle of a massive merger with another frosty beverage (beer) manufacturer. Then the DOJ handed them a rather large second request. Although these requests are extremely time-sensitive, clients can’t sacrifice quality for speed. This presents a rather difficult challenge for any company, especially if you are already over-budget on merger expenses.

The data in the request, about 500GB of Microsoft Outlook PSTs, was collected by the client. Since the client didn’t have enough time to limit the amount of redundancy while collecting the PSTs, duplicate emails and attachments slipped through the cracks, increasing the volume of data. In order to facilitate a speedy review, these duplicate documents needed to be pulled before review started.

Prior to the second request, the law firm had already contracted with an outside Attenex® provider for the processing, review and production. But their plate was already full and taking on another 500GB of email, which would very likely be hundreds of thousands if not millions of additional records, risked missing the DOJ deadline.

Although the 500GB of email needed to be loaded into Attenex, which provides a very fast way to review large sets of documents, our client turned to Logik for a solution to their time-critical issue.

So, what’s the problem?

  • 500GB of email = approximately 3,000,000 documents before culling
  • 30 days to review and produce to DOJ
  • Large volume of duplicate documents
  • Chosen vendor was already overloaded


What we did:

Logik has worked on fast-paced second requests before, so the incredibly tight turnaround wasn’t new to us, but the added Attenex element was an interesting twist we had little experience with. The client wanted us to process and reduce the data with Gridlogik™ and send only the unique parent emails in MSG format to the Attenex vendor. Ok, no problem. Then, our client requested on-going horizontal de-duplication (across the entire data set) to further reduce the data. Ok, no problem. Then they asked us if we could handle the TIFF productions to the DOJ, assuming we could match up the Attenex records with the Logik records. Again, no problem.
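
For readers who haven’t met “horizontal” de-duplication before, here is a minimal Python sketch of the idea: fingerprint each email once and keep only the first copy seen across all custodians. This is an illustration under our own assumptions, not Gridlogik’s actual method; the field names, the SHA-256 fingerprint and the function names are all hypothetical.

```python
import hashlib


def email_fingerprint(email):
    """Hash a few normalized fields so identical emails get the same key.

    The chosen fields are an assumption made for this example.
    """
    fields = ("from", "to", "subject", "sent", "body")
    normalized = "\x1f".join(email.get(f, "").strip().lower() for f in fields)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_horizontally(custodians):
    """Keep the first copy of each email across ALL custodians.

    `custodians` maps a custodian name to a list of email dicts.
    Returns a list of (custodian, email) pairs for the unique emails.
    """
    seen = set()
    unique = []
    for custodian, emails in custodians.items():
        for email in emails:
            key = email_fingerprint(email)
            if key not in seen:
                seen.add(key)
                unique.append((custodian, email))
    return unique
```

De-duplicating within a single custodian would use a fresh `seen` set per custodian; sharing one set across the whole data set is what makes it horizontal.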

Gridlogik is excellent with record keeping (every single document Gridlogik processes is tracked with a unique Logik ID). This made it very easy for the Attenex provider to send us exported Attenex XML files after a batch of documents was ready for production. We took the Attenex XML and easily paired the exported records with the Logik records. The matching process was fast and successful.

The matched records were flagged, formatted and converted to TIFF for production to the DOJ. Since we set up the Concordance database according to strict DOJ specs, the client’s quality control process and subsequent production approval were quick and painless, and they easily met their deadline. That makes everyone happy.

The results:

  • The 500GB of MS Outlook PSTs were reduced by 70% to 900,000 unique documents
  • The responsive documents yielded 1,000,000 pages produced to the DOJ
  • Although received in batches, the entire data set was processed in less than two weeks
  • After matching up all the Attenex/Logik records, production to DOJ took less than one week
  • The client made the tight deadline with room to spare
  • Although already over-budget, we kept the costs low with our predictable pricing
  • Tasty frosty beverages continue to be served around the world



It was tempting to ask for payment in a lifetime supply of quality beer for managing such a fast-paced and complex DOJ second request, but we chose the more conventional route and went with a check. Yes, it may seem more boring, but you can bet that check went to good use for the entire team. Cheers!

Logik — didyouknow_101

didyouknow_101 Posted By Andy Wilson on July 22, 2009

That reading through all of these “Did You Knows” qualifies you as an eDiscovery ninja?

Logik — didyouknow_100

didyouknow_100 Posted By Andy Wilson on July 22, 2009

That it will take a team of 10 reviewers ~500 days to review 10,000,000 documents, assuming 2,000 documents/reviewer/day?
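
The arithmetic behind that figure, as a small Python sketch (the function name review_days is ours, and it assumes every reviewer works at the same steady pace every day):

```python
import math


def review_days(total_documents, reviewers, docs_per_reviewer_per_day):
    """Days needed for a team to review a document set at a steady pace."""
    team_rate = reviewers * docs_per_reviewer_per_day  # documents per day
    return math.ceil(total_documents / team_rate)


if __name__ == "__main__":
    # 10 reviewers x 2,000 documents each = 20,000 documents per day
    print(review_days(10_000_000, 10, 2_000))
```

Doubling the team or halving the document count (say, through culling) cuts the estimate in half, which is why the culling tips matter so much.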

Logik — didyouknow_99

didyouknow_99 Posted By Andy Wilson on July 22, 2009

That you can use a mapped drive letter (e.g. X:\) to gain access to a Windows file whose full path has accidentally gone over the 260-character limit?
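
Why the trick works, sketched in Python: mapping a drive letter to a deep folder replaces that folder’s long prefix with two characters, so the remaining path is short again. The helper names below are hypothetical, the sketch only manipulates strings (it does not map any drive), and 260 is the classic Windows MAX_PATH value.

```python
MAX_PATH = 260  # classic Windows limit on a full path, in characters


def is_too_long(path):
    """True if the full path exceeds the classic Windows limit."""
    return len(path) > MAX_PATH


def remap_to_drive(path, deep_folder, drive="X:"):
    """Return the path as seen through a drive letter mapped to deep_folder."""
    folder = deep_folder.rstrip("\\")
    if not path.lower().startswith(folder.lower() + "\\"):
        raise ValueError("path is not inside deep_folder")
    return drive + path[len(folder):]


if __name__ == "__main__":
    deep = "C:\\Cases\\" + "\\".join(["very_long_folder_name"] * 12)
    original = deep + "\\memo.doc"
    print(len(original), is_too_long(original))
    print(remap_to_drive(original, deep))
```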

Logik — didyouknow_97

didyouknow_97 Posted By Andy Wilson on July 22, 2009

That early case assessment (ECA) is a buzzword that means a myriad of different things depending upon who you are talking to?

Logik — didyouknow_98

didyouknow_98 Posted By Andy Wilson on July 22, 2009

That Lotus Notes (in comparison to Microsoft Outlook) emails usually contain a very high number of embedded images in the body text of the email, like desktop screen-shots?

Logik — didyouknow_95

didyouknow_95 Posted By Andy Wilson on July 22, 2009

That Microsoft Exchange (.edb) databases can be easily opened by a variety of software products?

Logik — didyouknow_94

didyouknow_94 Posted By Andy Wilson on July 22, 2009

That AutoCAD documents should be viewed in native, not TIFF, format because of their 3-dimensional layouts?

Logik — didyouknow_96

didyouknow_96 Posted By Andy Wilson on July 22, 2009

That removing near-duplicate documents without first reviewing them could risk missing important information?

Logik — didyouknow_92

didyouknow_92 Posted By Andy Wilson on July 22, 2009

That efficient and timely pre-trial eDiscovery is a huge strategic advantage in litigation?

Logik — didyouknow_91

didyouknow_91 Posted By Andy Wilson on July 22, 2009

That MAPI = Messaging Application Programming Interface, and it allows access to email content and metadata?

Logik — didyouknow_93

didyouknow_93 Posted By Andy Wilson on July 22, 2009

That the “All Documents” view in Lotus Notes doesn’t always reveal ALL the documents, because it is a query and can be modified?

Logik — didyouknow_89

didyouknow_89 Posted By Andy Wilson on July 22, 2009

That if you redact a document, you should re-OCR the document before producing the text of that document?

Logik — didyouknow_88

didyouknow_88 Posted By Andy Wilson on July 22, 2009

That a developer at the New York Times converted 4 terabytes of TIFF images to PDF in under 24 hours with the use of Amazon’s EC2 cloud services?

Logik — didyouknow_90

didyouknow_90 Posted By Andy Wilson on July 22, 2009

That instant messages are discoverable information and are slowly overtaking email as the dominant form of business communication?

Logik — didyouknow_86

didyouknow_86 Posted By Andy Wilson on July 22, 2009

That printing electronic files to paper is, in many cases, totally unnecessary and wasteful?

Logik — didyouknow_85

didyouknow_85 Posted By Andy Wilson on July 22, 2009

That Adobe Photoshop files contain multiple layers of information, most of which are hidden from view and cannot be seen without the use of Photoshop?

Logik — didyouknow_87

didyouknow_87 Posted By Andy Wilson on July 22, 2009

That not all OCR software is created equal and that many don’t work very well?

Logik — didyouknow_84

didyouknow_84 Posted By Andy Wilson on July 22, 2009

That there is no realistic way to redact native files without first converting the file to an image?

Logik — didyouknow_83

didyouknow_83 Posted By Andy Wilson on July 22, 2009

That transporting your sensitive evidence in an unsafe container, like a cardboard box, is ok until that box is dropped on the floor or lands in a puddle?

Logik — didyouknow_81

didyouknow_81 Posted By Andy Wilson on July 22, 2009

That converting documents to TIFF might actually save you more time and money depending on your case?

Logik — didyouknow_82

didyouknow_82 Posted By Andy Wilson on July 22, 2009

That MS Excel documents can have charts layered on top of each other, hiding potentially relevant data?

Logik — didyouknow_79

didyouknow_79 Posted By Andy Wilson on July 22, 2009

That you can significantly cull large email collections just by isolating the domain name (e.g. @ebay.com)?
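
A rough idea of what that looks like in Python, assuming each message is a dict with a "from" header (that structure and the function names are ours, purely for illustration; `parseaddr` is from the standard library):

```python
from email.utils import parseaddr


def sender_domain(from_header):
    """Extract the lowercase domain from a From header value."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def cull_by_domain(messages, domains):
    """Keep only messages whose sender is in one of the given domains."""
    wanted = {d.lower().lstrip("@") for d in domains}
    return [m for m in messages if sender_domain(m["from"]) in wanted]


if __name__ == "__main__":
    inbox = [
        {"from": "Alice <alice@ebay.com>"},
        {"from": "newsletter@example.org"},
        {"from": "CAROL@EBAY.COM"},
    ]
    print(len(cull_by_domain(inbox, ["@ebay.com"])))
```

The same filter works in reverse: flip the membership test to drop known-irrelevant domains such as newsletters.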

Logik — didyouknow_78

didyouknow_78 Posted By Andy Wilson on July 22, 2009

That many enterprise search applications don’t extract embedded files?

Logik — didyouknow_80

didyouknow_80 Posted By Andy Wilson on July 22, 2009

That keyword searching is more effective if you talk to the person who created the data before confirming the keywords?

Logik — didyouknow_77

didyouknow_77 Posted By Andy Wilson on July 22, 2009

That Google just started performing OCR on PDF documents to make them Google-searchable in late 2008?

Logik — didyouknow_75

didyouknow_75 Posted By Andy Wilson on July 22, 2009

That focusing on what NOT to collect can dramatically reduce your discovery costs?

Logik — didyouknow_76

didyouknow_76 Posted By Andy Wilson on July 22, 2009

That most near-dupe technologies cannot group foreign language documents together?

Logik — didyouknow_74

didyouknow_74 Posted By Andy Wilson on July 22, 2009

That 7-Zip compression software typically achieves a better compression ratio than WinRAR or WinZip?

Logik — didyouknow_73

didyouknow_73 Posted By Andy Wilson on July 22, 2009

That iPods, iPhones, and BlackBerry devices can contain discoverable information?

Logik — didyouknow_72

didyouknow_72 Posted By Andy Wilson on July 22, 2009

That MS Excel 2007 supports over 1 million rows of data?

Logik — didyouknow_70

didyouknow_70 Posted By Andy Wilson on July 22, 2009

That copying 5GB of tiny files is much slower than copying 1 large 5GB file?

Logik — didyouknow_69

didyouknow_69 Posted By Andy Wilson on July 22, 2009

That a 100MB text file will produce 100,000 pages or more if printed?

Logik — didyouknow_71

didyouknow_71 Posted By Andy Wilson on July 22, 2009

That USB 3.0 is coming in 2009 and is 10 times faster than the current USB 2.0?

Logik — didyouknow_67

didyouknow_67 Posted By Andy Wilson on July 22, 2009

That Microsoft Word 97-2002 documents can contain deleted data hidden within the binary of the file if “allow fast saves” is enabled?
