Saturday, September 26, 2009

Week 5 Readings

1) Data Compression-data compression or source coding is the process of encoding information using fewer bits (or other information-bearing units) than an unencoded representation would use, through use of specific encoding schemes.
-only works when both the sender and receiver of the information understand the encoding scheme
pros: -helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth
cons: -compressed data must be decompressed to be used, and this extra processing may be detrimental to some applications
lossless vs. lossy compression: -lossless is possible because most real-world data has statistical redundancy
-in lossy compression, some loss of fidelity will occur, but it's mostly guided by research on how people will perceive the data in question
-Lossless compression schemes are reversible so that the original data can be reconstructed, while lossy schemes accept some loss of data in order to achieve higher compression
-theoretical framework for compression is provided by information theory and rate-distortion theory
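
A minimal Python sketch of the lossless vs. lossy distinction noted above. The sample data and variable names are assumptions made up for illustration; zlib is just one lossless compressor from Python's standard library, and rounding a number is only a toy stand-in for real lossy schemes.

```python
import zlib

# Sample data with a lot of statistical redundancy (made up for illustration).
original = b"25.888888888" * 100

# Lossless: compress, then decompress, and compare with the original.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
lossless_ok = restored == original                   # round trip is exact
lossless_smaller = len(compressed) < len(original)   # redundancy was squeezed out

# Lossy (toy example): rounding throws digits away for good.
value = 25.888888888
lossy_value = round(value, 1)                        # the dropped digits cannot be recovered

print(len(original), len(compressed), lossless_ok, lossy_value)
```

The lossless round trip gives back exactly the bytes that went in, while the rounded number is shorter but can never be turned back into the original value.
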
2) Data compression basics-data vs. information (I like the example that if someone sends you the same e-mail twice, you'll have two e-mails' worth of data but only one e-mail's worth of information)
-"The fundamental idea behind digital data compression is to take a given representation of information (a chunk of binary data) and replace it with a different representation (another chunk of binary data) that takes up less space (space here being measured in binary digits, better known as bits), and from which the original information can later be recovered."
-Run-length encoding: replaces "runs" (that is, sequences of identical characters) with a single character, followed by the "length of the run" (the number of characters in that sequence) in order to provide the same information in less space; if a file contained normal text such as a paragraph with few or no repetitions, RLE compression would not be useful
-The Lempel-Ziv compressor family: works by replacing redundant (i.e., repeated) source data with references to its previous appearance (LZ77) or with explicit references to a "dictionary" compiled from all the data in the source file (LZ78).
-Entropy coding: way to assign shorter codes to common data blocks, while assigning longer codes to rarer data blocks.
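
Two of the ideas above, run-length encoding and entropy coding, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (rle_encode, rle_decode, huffman_codes) and the sample strings are made up here, and Huffman coding is used as one well-known example of an entropy coder, not necessarily the one the article had in mind.

```python
import heapq
from collections import Counter
from itertools import groupby

def rle_encode(text):
    """Run-length encoding: one (character, run length) pair per run."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Rebuild the original text from (character, run length) pairs."""
    return "".join(ch * count for ch, count in pairs)

def huffman_codes(text):
    """Huffman coding: frequent symbols get shorter bit strings."""
    counts = Counter(text)
    if len(counts) == 1:                      # degenerate case: one distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, unique id, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)     # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

repetitive = "A" * 6 + "B" * 3 + "C" * 8
runs = rle_encode(repetitive)                 # 3 pairs stand in for 17 characters
plain_runs = rle_encode("normal text")        # no repeats, so nothing is saved
codes = huffman_codes("a" * 8 + "b" * 3 + "c")
print(runs, len(plain_runs), codes)
```

The repetitive string collapses to three pairs, while "normal text" needs one pair per character, which matches the note that RLE is not useful for ordinary prose. In the Huffman example, the common symbol "a" ends up with a shorter code than the rare symbol "c".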

-The Wikipedia article provides a basic overview of data compression, and even though I had to read it a few times to really feel like I had the gist (I'm not the most technically inclined person, to say the least), I thought it was written in such a way that it could be generally understood. The simple examples the article gave, such as the illustration of lossless and lossy compression through the use of the string 25.888888888, were very helpful in contributing to an understanding of how these types of compression compare and contrast. I felt like I got more out of the second article, because it went more in depth on what was covered in the Wikipedia article, and was written in far simpler language. This article also provided great examples, and the pictures depicting differences in compression of JPG images also helped the ideas to sink in. At first, when I began reading these articles, I sort of felt like I didn't see how they were useful to all of us. As I read further, I began to see that's not the case at all. In terms of practical applications within the library professions, as we continue to go more and more digital, I think many of us will need to familiarize ourselves with concepts such as image compression as we face issues such as providing images of print media and electronic content in ways that are efficient and cost-effective for the library but still beneficial for the patrons. Moreover, much as we face challenges with physical space now in developing and maintaining collections, we will face similar challenges pertaining to digital space as we have more and more information to present and preserve electronically, and an understanding of data compression will likely be very necessary for anyone working in this profession.

3) Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region
This article is about Pitt's Digital Research Library and its receipt of a grant from the IMLS to provide online access to historical photo collections containing over 7,000 images. The author included a summary of the project, challenges and accomplishments, and the project outcomes. It was really interesting to read about a practical application of the concepts we read about in the other articles, especially in light of the fact that it was a project right here at Pitt. I thought it tied in to the point I made earlier, that as we continue to work on increasing online access to materials, it will be more and more necessary to understand the nature of, as well as the possibilities and limitations of, data compression. A project such as this is something that many of us could likely encounter as we enter into the library professions. It was also interesting how it tied into other concepts we've discussed recently, such as digitization and metadata.

4) I couldn't find the 4th article; I kept receiving a 404 error whenever I tried to access the link. I will try to search through other blogs to see if anyone found a working link to this reading.

1 comment:

  1. I think the link for the 4th article is broken, but I just searched on Google for the article and came up with this:

    http://www.lita.org/ala/mgrps/divs/acrl/publications/crlnews/2007/jun/youtube.cfm

    That's the article I read, so I hope it's the right one. It is a pretty good article that basically follows what you say about the Pitt Digital Research Library. Eventually we, as librarians, will probably be put in situations like putting a how-to video on YouTube or enabling online access to various resources.
