Data


Data vs Information.

Data and information are not the same. Data is processed to become information and (just to be totally confusing) information is analysed to become data.

This is data.

11282019

Data can be in any form: numbers, text, images or Boolean values.

There is an equation that shows the connection between data and information.

Information = data + structure + context

Take 11282019. If we add some structure we get 11/28/2019, and if we add the American context (month first, then day) we get 28th November 2019.

02072020 is ambiguous, as it could be 2nd July 2020 in Europe or 7th February 2020 in the USA.
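The equation above can be tried out in a few lines of Python. This is only a sketch: the function name interpret and the two pattern constants are made up for illustration, and the "structure" is expressed as a strptime pattern from Python's standard datetime module.

```python
from datetime import datetime

# Two possible structures for the same eight digits (names are illustrative).
US = "%m%d%Y"      # month, day, year
EUROPE = "%d%m%Y"  # day, month, year


def interpret(raw: str, structure: str) -> str:
    """Apply a structure to raw digits and return a readable date."""
    parsed = datetime.strptime(raw, structure)
    return f"{parsed.day} {parsed.strftime('%B %Y')}"


print(interpret("11282019", US))
print(interpret("02072020", EUROPE))
print(interpret("02072020", US))
```

The same data, 02072020, gives two different pieces of information depending on the context chosen. Reading 11282019 with the European structure raises an error, because there is no month 28.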

Comparison Chart:

Basis for comparison | Data | Information
Meaning | Data is unrefined facts and figures, used as input for the computer system. | Information is the output of processed data.
Characteristics | Data is an individual unit which contains raw material and doesn't carry any meaning. | Information is the product and group of data which collectively carry a logical meaning.
Dependence | It doesn't depend on information. | It relies on data.
Peculiarity | Vague. | Specific.
Measuring unit | Measured in bits and bytes. | Measured in meaningful units like time, quantity, etc.

Data is raw, unanalysed, unorganised, unrelated, uninterpreted material which is used to derive information after analysis. Information, on the other hand, is perceivable and is interpreted as a message in a particular manner, which provides meaning to data.

Data on its own interprets nothing, as it carries no meaning, while information is both meaningful and relevant. Data and information are distinct terms, although in everyday use they are often treated as interchangeable. So, our primary goal is to clarify the essential difference between data and information.

Definition of Data :

Data is distinguishable information that is arranged in a particular format. The word data stems from the singular Latin word datum, whose original meaning is "something given". The word has been in use since the 1600s, and nowadays data is the customary term even though it is the plural of datum.

The examiner likes the definition "Raw facts and figures before they have been processed."

Data can take multiple forms, such as numbers, letters, sets of characters, images and graphics. In a computer, data is represented as patterns of 0s and 1s which can be interpreted to represent a value or fact. The measuring units of data are the bit, nibble, byte, kB (kilobyte), MB (megabyte), GB (gigabyte), TB (terabyte), PB (petabyte), EB (exabyte), ZB (zettabyte) and YB (yottabyte).
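As a rough illustration of the units listed above, the sketch below converts a number of bytes into the largest sensible unit. The function name human_readable is made up, and the step of 1000 between units is an assumption (the decimal definition); some courses and operating systems use 1024 instead.

```python
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

BITS_PER_NIBBLE = 4
BITS_PER_BYTE = 8


def human_readable(num_bytes: int, step: int = 1000) -> str:
    """Express a byte count in the largest unit that keeps the value below one step."""
    size = float(num_bytes)
    for unit in UNITS:
        if size < step or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= step


# The letter A stored as a pattern of eight 0s and 1s (one byte).
bits_for_a = format(ord("A"), "08b")

print(human_readable(512))
print(human_readable(1500))
print(human_readable(5_000_000_000))
print(bits_for_a)
```

Passing step=1024 gives the binary interpretation of the same units.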

Data was originally stored on punched cards, which were replaced by magnetic tapes and hard disks and, more recently, by solid state devices.

There are two variants of data, Qualitative and Quantitative.

  • Qualitative data describes categories or qualities that are observed and expressed through natural language.
  • Quantitative data is numerical: it covers counts and measurements and can be expressed in terms of numbers.

Data deteriorates as time passes.

Definition of Information :

Information is what you get after processing data. Data and facts can be analysed or used in an effort to gain knowledge and reach a conclusion. In other words, data that is accurate, systematised, understandable, relevant and timely is information.

Information is an older word, in use since the 1300s, and has French and English origins. It is derived from the Latin verb "informare", which means to inform; to inform is interpreted as to form and develop an idea.

This can be developed further.

Data Vs. Information

Parameters | Data | Information
Description | Qualitative or quantitative variables which help to develop ideas or conclusions. | It is a group of data which carries news and meaning.
Etymology | Data comes from a Latin word, datum, which means "something given". Data is the plural of datum, although over time it has come to be used as a singular too. | The word information has Old French and Middle English origins. It has referred to the "act of informing". It is mostly used for education or other known communication.
Format | Data is in the form of numbers, letters, or a set of characters. | Ideas and inferences.
Represented in | It can be structured, tabular data, graph, data tree, etc. | Language, ideas, and thoughts based on the given data.
Meaning | Data does not have any specific purpose. | It carries meaning that has been assigned by interpreting data.
Interrelation | Information that is collected. | Information that is processed.
Feature | Data is a single unit and is raw. It alone doesn't have any meaning. | Information is the product and group of data which jointly carry a logical meaning.
Dependence | It never depends on information. | It depends on data.
Measuring unit | Measured in bits and bytes. | Measured in meaningful units like time, quantity, etc.
Support for decision making | It can't be used for decision making. | It is widely used for decision making.
Contains | Unprocessed raw facts. | Processed in a meaningful way.
Knowledge level | It is low-level knowledge. | It is the second level of knowledge.
Characteristic | Data is the property of an organization and is not available for sale to the public. | Information is available for sale to the public.
Dependency | Data depends upon the sources for collecting data. | Information depends upon data.
Example | Ticket sales for a band on tour. | Sales report by region and venue. It shows which venue is profitable for that business.
Significance | Data alone has no significance. | Information is significant by itself.
Meaning | Data is based on records and observations, which are stored in computers or remembered by a person. | Information is considered more reliable than data. It helps the researcher to conduct a proper analysis.
Usefulness | The data collected by the researcher may or may not be useful. | Information is useful and valuable as it is readily available to the researcher for use.
Dependency | Data is never designed to the specific need of the user. | Information is always specific to the requirements and expectations because all the irrelevant facts and figures are removed during the transformation process.
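The ticket sales example in the table can be sketched in Python. All of the figures, regions and venue names below are invented for illustration, and the function name report_by_venue is hypothetical. The report shows revenue only; deciding which venue is truly profitable would also need cost data, which is not modelled here.

```python
from collections import defaultdict

# Raw data: (region, venue, tickets sold, ticket price in pounds).
# Every value here is invented for the example.
sales = [
    ("North", "Arena A", 1200, 25),
    ("North", "Hall B", 300, 25),
    ("South", "Arena C", 900, 30),
    ("North", "Arena A", 800, 25),
]


def report_by_venue(rows):
    """Process the raw rows into total revenue per (region, venue)."""
    totals = defaultdict(int)
    for region, venue, tickets, price in rows:
        totals[(region, venue)] += tickets * price
    return dict(totals)


report = report_by_venue(sales)

# Present the information with the highest earning venue first.
for (region, venue), revenue in sorted(report.items(), key=lambda item: -item[1]):
    print(f"{region:<6} {venue:<8} £{revenue:,}")
```

The individual rows are data; the sorted totals are information, because they have been given structure and can support a decision.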

DIKW (Data Information Knowledge Wisdom)

DIKW is the model used for discussion of data, information, knowledge, wisdom and their interrelationships. It represents structural or functional relationships between data, information, knowledge, and wisdom.

Conclusion

  • Data is a raw and unorganised fact that needs to be processed to make it meaningful.
  • Information is a set of data which is processed in a meaningful way according to the given requirement.
  • Data comes from a Latin word, datum, which means "something given".
  • The word information has Old French and Middle English origins. It has referred to the "act of informing".
  • Data is in the form of numbers, letters, or a set of characters.
  • Information is mainly in the form of ideas and inferences.
  • DIKW is the model used for discussion of data, information, knowledge, wisdom and their interrelationships.

Data types

Data type | Description | Example of data | How it could be used
Text | Any character. | Db7&-?hT5 | To store the names of items, events or people. Phone numbers are usually stored as text as this means that they can have spaces and the leading 0 is maintained. Calculations are usually not carried out on text values.
Alphanumeric | Any combination of letters, symbols, spaces or numbers. | AjcY6*9eX3 | To store postcodes as these contain a combination of letters and numbers.

Numeric types
Integer | Whole numbers. | 1960 | To store numbers such as items of stock, number of lengths swum, number of tickets sold in one day for a live concert, TV channel numbers or years.
Real | Any number, with or without decimal places. | 12.30 | To store height, weight, length, scientific values.
Currency | Shows the data in the format of money; it can be used to show currency signs and has the decimal digits to show the full currency details. | £79.56 | To store prices.
Percentage | A number format that includes decimal places and a % sign. | 25% | To show the percentage of a discount, such as 10% off the price.
Fraction | A number format (usually included in spreadsheet software) that enables the two parts of a fraction to be input and manipulated. | ½ | To show the result of a calculation expressed as a fraction.
Decimal | A number format that shows an exact number using the decimal point and the numbers after the decimal point. | 22.75 | To show the result of a calculation expressed as a decimal.

Date / Time | A date or time. There are different formats of date and time that can be used; which one is selected will depend on how the date/time is to be stored and processed. | 25/04/2019 or 19:15 | To show a date or a time.
Limited choice | Restricts the choice by a user to one of a set of mutually exclusive choices; usually used in an information-gathering document. | A drop-down list, radio buttons or a tick list. | To select a day of the week or to select a payment method.
Object | An additional component usually found in a spreadsheet or SQL database; can be called a "BLOB" (binary large object), such as a sound file or video. | A chart or graph taken from a different source. | To insert a chart into a worksheet that has been taken from a different file.
Logical / Boolean | There are only two choices: true or false. | Yes or no; true or false; 1 or 0. | To store the answer to a closed question.

[Interestingly, the textbook gives male or female as an example, but given that the choice is now mflgbtqqip2saa it is not exactly choose 1 from 2.]
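Several of the data types above can be seen in a short Python sketch. The variable names and the phone number are invented for illustration, and Python's own types (str, int, float, Decimal, date, bool) are only approximate stand-ins for the spreadsheet and database types in the table.

```python
from datetime import date
from decimal import Decimal

# Text: an invented phone number. Stored as text it keeps its space and leading 0.
phone_as_text = "07700 900123"
# Stored as an integer, the space has to go and the leading 0 is lost.
phone_as_integer = int(phone_as_text.replace(" ", ""))

tickets_sold = 1960             # Integer: a whole number
length_m = 12.30                # Real: a number with decimal places
price = Decimal("79.56")        # Currency: Decimal keeps pounds and pence exact
discount = Decimal("0.25")      # Percentage: 25% held as a fraction of 1
event_date = date(2019, 4, 25)  # Date: 25/04/2019
is_member = True                # Boolean: only true or false

# Calculations make sense on numeric types, not on text.
sale_price = (price * (1 - discount)).quantize(Decimal("0.01"))

print(phone_as_text, phone_as_integer)
print(f"£{sale_price}")
print(event_date.strftime("%d/%m/%Y"))
```

The phone number shows why the table recommends text for values that look numeric but are never used in calculations.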

Activity

Copy and complete this table. In each cell, give an example of the data type being used effectively in the situation. For example, a shop will use a real number to show the weight of a product.

 | School | Sports Club | Shop
Real number | | |
Integer | | |
Boolean | | |
Date/time | | |

How can information be described?

The meaning of data relies on the data itself, its structure and its context.

For example

Data | Structure | Context | Meaning
01012019 | dd/mm/yyyy | A UK date | New Year's Day 2019.
30 40 50 60 70 | Integer numbers | Miles per hour | UK speed limits on roads and motorways.
TRNB14 | LL/LL/NN (first two letters: type of clothing; second two letters: colour; last two numbers: UK size) | A clothing shop stock code | A navy blue pair of size 14 trousers.
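The stock code example can be sketched in Python to show data, structure and context working together. The function name decode_stock_code and the two lookup tables are hypothetical; they contain only the codes given in the example, and a real shop would hold many more.

```python
import re

# Context: what the letter codes mean in this (imaginary) clothing shop.
CLOTHING = {"TR": "trousers"}
COLOURS = {"NB": "navy blue"}


def decode_stock_code(code: str) -> str:
    """Turn a stock code with the structure LL/LL/NN into a readable meaning."""
    # Structure: two letters, two letters, two digits.
    match = re.fullmatch(r"([A-Z]{2})([A-Z]{2})(\d{2})", code)
    if match is None:
        raise ValueError(f"{code!r} does not fit the structure LL/LL/NN")
    clothing, colour, size = match.groups()
    return f"{COLOURS[colour]} {CLOTHING[clothing]}, UK size {int(size)}"


print(decode_stock_code("TRNB14"))
```

A code that fits the structure but uses letters missing from the lookup tables raises a KeyError: the data and the structure are present, but without the context there is no meaning.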

Activity

Copy and complete the following table. Fill in the table to show how data can be turned into information. The first one has been completed for you.

 | Shoe shop | Police station | Hospital
Data | MTNSSS10 | |
Structure | LLLLLLNN | |
Context | A shoe shop stock code | |
Meaning | Men's trainers, non-slip soles, size 10 | |

Questions

  1. Describe the difference between data and information. [2 marks]
  2. Describe one characteristic of data. [2 marks]
  3. Using an example of a stock number NBLT16, show how this data can be turned into information. [4 marks]

 

Methods of collecting data and information

 

Further study

Questionnaires and online surveys. The examiner identifies 4 types of question: open, closed, rank order and rating. There is also multiple choice. Writing a questionnaire is quite easy, but writing an effective questionnaire is more difficult.

Question: Read this document from Harvard and write 5 key facts that you think are important about questionnaires.

Question: Create a 5 question survey about local music where each question is a different type.

Question: Create a help sheet that explains each of the 5 question types.

Question: Use this website to create a table of advantages and disadvantages of online surveys.

Email.

Question: Imagine that you are creating an email based survey for school, read this webpage and choose the most appropriate and least appropriate tip for your school survey. Explain your choices.

Sensors. Devices that can detect inputs from the environment.

Question: Use this website and this website and this website to make a help sheet about sensors.

Question: Research the "IoT" or the "IoE". Make an A4 poster from your research.

Interviews.

Question: Use this website to make a table comparing interviews with questionnaires.

Question: Use this website that considers interviews from a marketing perspective to list the pros and cons of interviews.

Consumer panels.

Question: Explain what is meant by the term "Consumer Panel". This website will help. This website, here may also help.

Loyalty schemes

Loyalty schemes are a common but unusual method of data collection: the consumer receives a benefit in exchange for data about their habits and choices, probably without knowing that this data is being collected.

Question: Create a list of pros and cons using this website.

Statistical reports

Question: How often is the official census taken and when will the next one take place? The last census report for Fenland may hold a clue.

Secondary research methods.

Activities

  1. Use Survey Monkey to create an online questionnaire to find out about leisure activities in your post code area. Use as many of the 5 types of questions as possible.
  2. Use Adobe PDF Writer to create a form that could be emailed to your classmates to find out about employment in your post code.
  3. Compare the "IoT" with the "IoE" and decide which is best.
  4. Use Street Check to find out about your local postcode, in particular housing and employment. Present your information as a 3 slide PowerPoint.
  5. Create a table that shows which methods of collecting data can reduce human error when inputting data and those that do not.
  6. The second and third columns in this table have been jumbled up. Copy the table and rearrange the advantages and disadvantages so they fit the correct method.
Method Advantages Disadvantages
Questionnaires / Surveys
  • The same email can be sent to many people at the same time.
  • The cost of consumer panel feedback can be low if online feedback methods are used.
  • If a trusted source is used, then the statistics are readily available, cover a range of topics and are reliable.
  • Sensors may stop working, for example if there is a power cut.
  • May not have been collected for the same purpose so may not provide clear and full data.
Emails
  • The data collected by a sensor is usually more accurate than that taken by people, for example people can lose count but sensors just keep on working.
  • Statistics can show trends and patterns that can help with decision making.
  • Can be time consuming and costly to carry out.
  • The format of the feedback needs to match the processing that needs to be carried out.
Sensors
  • Large numbers of people can be asked to fill in the same questionnaire/survey.
  • A rapport can develop between the interviewer and the interviewee that may result in the questions being answered honestly.
  • The data has already been collected and possibly processed.
  • It is not always possible to tell if the data is real/genuine.
  • Not suitable for gathering data and information from large numbers of people.
  • Statistics show data from a sample of people rather than a true representation.
  • The data may not be exactly what is required.
Interviews
  • The results from the emails can be automatically input into software for analysis/manipulation.
  • The data collected each time a customer uses their loyalty card can provide information on the habits of the customer.
  • Data collection is quicker than having to collect the data first-hand.
  • The positioning of sensors needs to be carefully considered as incorrect placing could result in worthless data being collected.
  • Some people feel that the data collected about them through a loyalty scheme can be an invasion of privacy.
Consumer panels
  • Once set up, do not need human intervention as the data collected can be sent electronically.
  • A loyalty scheme can keep customers using the business.
  • If the fields or data types are not exactly the same as the fields being used for the analysis/manipulation then the data collection may be worthless.
  • Statistics need to be collected knowing how they are going to be analysed/processed and stored.
Loyalty schemes
  • Comparisons are easy to formulate (e.g. 75 percent of people liked the new company logo.)
  • The feedback provided is specific to the product or service.
  • Some processing may have already been carried out.
  • Emails may be diverted into spam/junk folders by the email provider.
  • If there isn't a range of people on the panel, the feedback could be biased towards one specific type of person.
Statistical reports
  • There is little risk of human error occurring when the data collected is input into the software.
  • Additional questions can be asked to clarify any answers already given.
  • Response rates are high as members of the panel have agreed to take part.
  • A badly designed question might not get the data required in the right format.
  • Poor interviewing can lead to misleading or insufficient data and information being gathered.
Secondary research methods
  • Cheaper than interviews for large numbers of people.
  • Questions can be modified based on the answers given to previous questions.
  • If the questionnaire/survey is online, people need the technology to complete it.
  • If products need to be provided to the panel, the cost may be high in terms of the actual product and the delivery to members of the panel.

 

Methods of processing data

Spreadsheets | Both | Databases
Spreadsheets are designed to store text and numeric data. | Both are designed to store data and can store data in all formats. Spreadsheets have become financial tools and so are inclined to numeric operations. | Databases are used to store and process data and text.
Graphs can be created to show the results of the processing of numerical data. | Spreadsheets can manipulate text just as easily as a database if you know how. | Databases can draw graphs and can process numeric data; they are not specialists but it can be done.
 | They allow the entry, storage, editing and processing of data. Data scientists call this CRUD: Create, Retrieve, Update, Delete. |
The format of data can be set, for example as currency, to meet the defined objectives. | | This is also true for databases.
Functions and formulae can be used to calculate or recalculate results. | In both cases, code can be used to find or change data. | A query can be used to find specific records.
Modelling can be carried out. | |
Worksheets can be used within a workbook. | A spreadsheet makes connections between worksheets and a database makes connections between tables (relationships). | A database is stored in a table or tables. A table is a file that is made up of records. A record is a collection of fields. A field holds one item of data. An item of data is made up of characters.
Absolute and relative references can be used. | |
 | Validation and verification apply to both databases and spreadsheets. | Validation can be set for different fields.
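The database side of this comparison (CRUD, queries and field validation) can be tried with Python's built-in sqlite3 module. This is a minimal sketch: the table name stock, its fields and the quantities are invented for the example, and the validation rule is an SQL CHECK constraint on a single field.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# One table, two fields. The CHECK rule is validation set on the quantity field.
conn.execute(
    """
    CREATE TABLE stock (
        code TEXT PRIMARY KEY,
        quantity INTEGER NOT NULL CHECK (quantity >= 0)
    )
    """
)

# Create: add two records.
conn.execute("INSERT INTO stock VALUES (?, ?)", ("TRNB14", 12))
conn.execute("INSERT INTO stock VALUES (?, ?)", ("MTNSSS10", 3))

# Retrieve: a query that finds specific records.
low_stock = conn.execute("SELECT code FROM stock WHERE quantity < 5").fetchall()

# Update: change one field in one record.
conn.execute("UPDATE stock SET quantity = 20 WHERE code = ?", ("MTNSSS10",))

# Delete: remove one record.
conn.execute("DELETE FROM stock WHERE code = ?", ("TRNB14",))

# Validation: a negative quantity is rejected, but the database cannot know
# what the correct value should have been.
try:
    conn.execute("INSERT INTO stock VALUES (?, ?)", ("NBLT16", -5))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

remaining = conn.execute("SELECT code, quantity FROM stock").fetchall()
print(low_stock, remaining, rejected)
```

Validation of this kind only checks that a value is reasonable; it takes verification, such as a person checking the entry against the source, to confirm that the value is actually right.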

Questions

1. A team of researchers are studying urban wildlife, such as foxes, mice, and birds. The team collect data during the day and night. Sometimes, they work in an office.

(a) The team:

  • use a range of hardware, including laptops, tablets, and smartphones
  • use different operating systems and applications
  • communicate with each other using smartphones or tablets
  • store and share data, including images, audio recordings and videos
  • work collaboratively on research documents.

Identify the secondary storage medium most suitable for the team and justify why it best meets their needs. Write your answer in a Word document. [6 marks]

(b) The team analyse the data when they are in the office. They have to log on to the office network using a username and password to access the data. What social engineering exploit might be used to discover their passwords? [1 mark]

(c) The team use a software application and the data they have collected to create a model of urban wildlife in order to understand more about the environment wildlife live in. Examples of data collected:

  • number and type of animals seen each hour
  • number of newborn animals
  • daily weather conditions
  • type and amount of food available.

Describe one way the team could use the model to help them to research the urban wildlife environment. [2 marks]

2. Some research has shown that the use of technology promotes a sense of belonging. This type of research is usually supported by the use of online questionnaires.

(a) Describe one way in which the use of technology makes us a more inclusive society. [2 marks]

(b) The questionnaire asks for the number of smartphones a student can access. One student enters -38 (negative) in error. Explain how a database could detect this error but could not correct it. [4 marks]

3. Stevie finds that the T-shirts that she bought online do not match the description on the website.

(a) Describe her consumer rights. [2 marks]

(b) Stevie wants to inform others about her experience. State one way that she could do this online. [1 mark]

(c) Many websites use online targeted marketing. Describe one method that companies use to target online marketing at individuals. [2 marks]

(d) Many people shop online. Discuss the benefits and drawbacks to the customer of shopping online. [6 marks]

(e) What information is collected by the online shopping site every time Stevie browses their site, whether she buys something or not? [5 marks]

The environmental impact.
