Tuesday, December 8, 2009

Comments- 12/8-12/15

This week I commented on Rachel's blog here:
http://knivesnmatches.blogspot.com/2009/12/readings-for-1215.html

and Veronica's blog here:

Muddiest Point (12-1)

I did not have a muddiest point from last week's lecture.

Unit 14 Reading Notes (12/15)

1) Galen Gruman. “What cloud computing really means” InfoWorld, April 2008.
-This article explores cloud computing, a somewhat recent trend in the IT world that is difficult to define precisely because, according to the article, many users define it differently. It points out that it can be defined narrowly as "virtual servers available over the Internet" or more broadly as "anything you consume outside the firewall."
-The article offers this definition of cloud computing: cloud computing encompasses any subscription-based or pay-per-use service that, in real time over the Internet, extends IT's existing capabilities. It then breaks this down into more specific types of cloud computing:
1) SaaS (software as a service)
-delivers a single application through the browser to thousands of customers using a multitenant architecture.
-for the customer, this means no upfront investment in servers or software licensing
-for the provider, there is just one app to maintain so costs are low compared to conventional hosting
2) Utility Computing
-not a new idea, but it is gaining new life from providers such as Amazon.com, which now offer storage and virtual servers that IT can access on demand
3) Web Services in the Cloud
-closely related to SaaS; providers offer APIs that let developers exploit functionality over the Internet, rather than delivering full-blown applications
4) Platform as a Service
-another SaaS variation; delivers development environments as a service; you build your own applications that run on the provider's infrastructure and are delivered to your users via the Internet from the provider's servers
-the article compares this to Legos: you are limited by the vendor's design and capabilities, so you don't get complete freedom, but you do get predictability and pre-integration
5) MSP (Managed Service Providers)
-"One of the oldest forms of cloud computing, a managed service is basically an application exposed to IT rather than to end-users." The article uses e-mail spam filtering as an example.
6) Service Commerce Platforms
-a hybrid of SaaS and MSP; this cloud computing service offers a service hub that users interact with
7) Internet Integration
-the integration of cloud-based services with one another; still in its early days

This article helped give me a general idea of what cloud computing is all about. What I mostly took from it is that instead of relying on your own servers and on complicated, expensive hosting and development, cloud computing offers many different services on a pay-as-you-go basis, allowing you to easily integrate what you need to serve your users without paying for excess that will not be used. It seems that taking advantage of a provider's services in this way is much easier, more effective, and more cost-efficient for the customer.

2) YouTube Video- Explaining Cloud Computing
This video offers a very clear and concise explanation of what cloud computing is, how it is associated with Web 2.0, and what it means for both users and providers. It explains what cloud computing entails and gives examples such as Google Docs, which I have used myself without realizing it was considered an example of cloud computing. Overall it does a very thorough job of making a somewhat complex topic far more approachable.

3) Thomas Frey. The Future of Libraries: Beginning the Great Transformation
This article discusses the traditional role of the library as a storehouse to archive manuscripts, art and important documents- the foundational building blocks of information for all of humanity. It then goes on to discuss how the role of libraries is changing. I thought one of the most succinct points in the article was this:

"We have transitioned from a time where information was scarce and precious to today where information is vast and readily available, and in many cases, free." The article goes on to point out that many people who once visited the library to access this scarce and precious material can typically find it online and often for free. So where does that leave libraries? Frey goes on to discuss ten key trends that are influencing the future of the library.
Trend #1 – Communication systems are continually changing the way people access information
Trend #2 – All technology ends. All technologies commonly used today will be replaced by something new.
Trend #3 – We haven’t yet reached the ultimate small particle for storage. But soon.
Trend #4 – Search Technology will become increasingly more complicated
Trend #5 – Time compression is changing the lifestyle of library patrons
Trend #6 – Over time we will be transitioning to a verbal society
Trend #7 – The demand for global information is growing exponentially
Trend #8 – The Stage is being set for a new era of Global Systems
Trend #9 – We are transitioning from a product-based economy to an experience based economy
Trend #10 – Libraries will transition from a center of information to a center of culture

Frey also offers his recommendations for libraries, regarding how we can continue to adapt and grow in order to remain relevant in this digital age. I was especially interested in Trend 10- libraries will transition from a center of information to a center of culture. I think this is very true in many ways. The information we can offer, or more specifically in a digital age, the means of accessing information we can offer, will always be a central part of libraries. But to me, every library has a unique personality, directly related to the community it serves. The idea of a library becoming a center of culture is not unheard of, in my opinion. We see this in some of the libraries in our own county- for instance the Braddock Carnegie Library, which is seeking to serve its community in new and innovative ways. If the services we once provided are rendered unnecessary through technology, we must seek to provide other services- such as interaction with culture and community- that cannot be offered through a computer alone.

Sunday, November 22, 2009

Assignment 6- Website

Here is the link to the website I made for assignment 6.

http://www.pitt.edu/~jac228

Saturday, November 14, 2009

Unit 10 Comments

I commented on Rachel Nash's blog here

and Rachel Cannon's blog here

Unit 9 Muddiest Point

I did not have a muddiest point from this week.

Unit 10 Reading Notes- Web Search and OAI Protocol

1) David Hawking, Web Search Engines: Part 1 and Part 2
Part 1
-Hawking focuses on the data processing "miracle" of search engines that sort through hundreds of millions of queries every day, by examining the problems that whole-of-web search engines face, and techniques available to solve these problems
-infrastructure: large search engines operate multiple, geographically distributed data centers; services are built up from clusters of commodity PCs, the types of which depend on various factors; the total number of servers for the largest engines is estimated to be in the hundreds of thousands; clusters or individual servers can be dedicated to specialized functions (e.g. crawling, indexing, etc.); large-scale replication is required to handle the necessary throughput
-crawling algorithms: crawler initializes queue with one or more "seed" URLs; a good seed URL will link to many high quality websites; crawling proceeds by making an HTTP request to fetch the page at the first URL in the queue, then scans content for links to other URLs and adds each one to the queue
-crawling algorithm must address the following issues:
1) speed
2) politeness
3) excluded content
4) duplicate content
5) continuous crawling
6) spam rejection
-crawlers are highly complex parallel systems communicating with millions of Web servers, and as such there are many issues involved with engineering a Web-scale crawler
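
To make the queue idea from my notes concrete, here is a minimal sketch of the basic crawl loop. It is not Hawking's code and is nowhere near a real crawler: FAKE_WEB, LinkParser and crawl are names I made up, the "Web" is just a small dictionary standing in for HTTP requests, and the only issue from the list above that it touches is skipping URLs it has already queued (speed, politeness, excluded content, duplicate content at different URLs and spam are all ignored).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical stand-in for the Web: URL -> page HTML.
# A real crawler would make an HTTP request here instead.
FAKE_WEB = {
    "http://example.org/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.org/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.org/b": '<a href="/c">C</a>',
    "http://example.org/c": "no links here",
}


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch=FAKE_WEB.get, max_pages=100):
    """Return the URLs fetched, in the order they were visited."""
    queue = deque(seeds)   # initialize the queue with the seed URLs
    seen = set(seeds)      # URLs already queued, so none is queued twice
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()          # take the first URL in the queue
        html = fetch(url)              # "fetch" the page
        if html is None:               # unknown page: nothing to scan
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)              # scan the content for links
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)     # add each new URL to the queue
    return visited
```

Calling crawl(["http://example.org/"]) visits the four fake pages breadth-first, starting from the seed.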

Part 2
-reviews algorithms and data structures necessary to index 400 terabytes of text on the Web and deliver high-quality results
-indexing algorithms: search engines use an inverted file to rapidly identify the documents that contain a given indexing term; the file is built in two phases (scanning and inversion)
-real indexers: store additional information in the postings, such as term frequency or position; aspects of real indexers include
-scaling up
-term look up
-compression
-phrases
-anchor text
-link popularity score
-query-independent score
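
Here is a toy sketch of the two indexing phases as I understood them, assuming a handful of short documents that fit in memory. The function names (scan, invert, search) and the simple tokenizer are my own choices, not from the article, and nothing here attempts the scaling, compression or scoring items listed above.

```python
import re
from collections import defaultdict


def scan(docs):
    """Scanning phase: emit one (term, doc_id, position) posting per word."""
    postings = []
    for doc_id, text in enumerate(docs):
        terms = re.findall(r"[a-z0-9]+", text.lower())
        for position, term in enumerate(terms):
            postings.append((term, doc_id, position))
    return postings


def invert(postings):
    """Inversion phase: sort the postings by term, then group them by term."""
    index = defaultdict(list)
    for term, doc_id, position in sorted(postings):
        index[term].append((doc_id, position))
    return dict(index)


def search(index, term):
    """Return the ids of the documents that contain the term."""
    return sorted({doc_id for doc_id, _ in index.get(term.lower(), [])})


docs = ["The cat sat", "The dog sat down", "A cat and a dog"]
index = invert(scan(docs))
print(search(index, "cat"))  # ids of the documents containing "cat"
```

Storing the position alongside the document id is what the notes mean by keeping additional information in the postings; phrase search would build on that.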

Part 2 also includes an outline of the techniques real search engines use to 'speed things up', given the vast amount of information they have to sort through to produce quality results quickly.

This two-part series of articles is extremely helpful in explaining the basics of how search engines function. I thought the articles were easy to read and actually pretty interesting.


2) Current developments and future trends for the OAI protocol for metadata harvesting
OAI- Open Archives Initiative
I didn't fully understand what the OAI was at first, but as I read more of the article, it began to become clearer. The article did provide a brief explanation of the OAI, including its mission to "provide a worldwide virtual library of language resources" through the development of community-based standards for archiving. The examples provided of the Sheet Music Consortium and the National Science Digital Library were very interesting- I had never heard of either of these before reading the article. The article gave an overview of the standards and objectives for searching OAI repositories, and the future work necessary to improve this further.
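
Since harvesting felt abstract to me at first, here is a small sketch of what a harvest request looks like under the OAI protocol for metadata harvesting: it is just a URL with a verb and a few arguments. The helper name list_records_url and the repository address are hypothetical; the verb (ListRecords) and the arguments (metadataPrefix, set, resumptionToken) come from the protocol itself. The sketch only builds the URL and does not contact any repository.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # A resumptionToken continues an earlier, partial response and
        # is sent on its own, without the other arguments.
        params = {"verb": "ListRecords",
                  "resumptionToken": resumption_token}
    else:
        # oai_dc (unqualified Dublin Core) is the metadata format that
        # every repository is required to support.
        params = {"verb": "ListRecords",
                  "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec  # optionally limit the harvest to one set
    return base_url + "?" + urlencode(params)


# Hypothetical repository address, for illustration only.
print(list_records_url("http://repository.example.org/oai"))
```

A harvester would fetch that URL, read the XML records that come back, and repeat the request with the resumption token until the repository stops returning one.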

3) “The Deep Web: Surfacing Hidden Value”
"Searching on the Internet today can be compared to dragging a net across the surface of the ocean. While a great deal may be caught in the net, there is still a wealth of information that is deep, and therefore, missed. "
I thought this analogy for searching on the Internet was a really great one. It highlights the challenges of searching on the Web, as well as the incredible content that's buried out there and can be accessed with the development of the right technologies. The list of findings that BrightPlanet published from its study of the Deep Web is somewhat astonishing; it is hard to believe that such an amazing quantity of information is available on the Web and is currently largely inaccessible to the average searcher. The thorough explanations of how search engines function, and of what sort of technology is necessary to access the Deep Web, were truly interesting and helped increase my knowledge a good deal. Also, I shared the opinion of some of my classmates that the illustrations and graphs in this article really helped make the point sink in.