There are two sources for this Module's Case:
Tony Oulton has an interesting article about "management research for information" (Oulton, Tony. (1995) Management research for information. Library Management. Vol.16, Iss. 5; pg. 75-81 -- available through ProQuest); here's the abstract:
It is suggested that the conclusions of a research study, whether one's own or another's, have value as a source of information for decision making. Experienced managers and prompt students of management are reminded about various approaches to research. Research is defined, and an evaluation is made to determine its worth in terms of validity, reliability and generalizability. Various approaches to research, in particular quantitative and qualitative research, and the interpretation of findings are examined. The use of statistical inference, statistical expression of probability and nonparametric and multivariate statistics is explained.
Don't let the last part scare you yet; it's not all that technical.
A “Consumer's Guide” to the Business and Management Literature by Dr. John Kmetz of the University of Delaware (available at http://www.buec.udel.edu/kmetzj/PDF/Chapter1.pdf) is a very useful discussion of alternate kinds of sources of knowledge in the field of business.
When you've read through both of these sources, please compose a short (ca. 4-6 pages) paper on the general topic of believability in business research. The following questions are suggested as things to think about, not necessarily as points to be answered specifically in your paper:
What does each of these sources tell you about the basis for believing what you read in a research study? Do you believe them?
Are there any other insights into the question of believability that you've gained from the Background Readings? The General References listed at the top of the Background Readings page? Other sources? What are the insights?
We said earlier that this course was to cultivate in you twin capacities: to be practitioners capable of conducting research in your domain, and to be educated users and critics of the research of others. To what degree might these things require different skills? What might those differences be?
What's the current state of your thinking about believability? How can we ensure that we disseminate only truth in our own research? How can we be sure that we believe only the truth in others' research?
How important is truth, really, to the businesses and organizations with whom we work and on behalf of whom we conduct research?
When your paper is done, send it in to CourseNet.
A “Consumer's Guide” to the Business and Management Literature
This chapter began years ago as a short “tip sheet” for students who were often puzzled
and frustrated by their ventures into the academic literature. These experiences were
usually motivated by course requirements from faculty in our MBA program, most of
whom genuinely believed that the rigor and quality of the research reported in the
academic journals exceeded that from any other source. Students were therefore not only
encouraged but required to become consumers of material from the academic journals as
well as other more general sources of information.
The literature of management has been very much a part of the broader
“information explosion” of recent decades. Consumers seeking to use this material must
cope with a bewildering variety of information. This variety not only takes the form of a
huge number of sources, but differences along many dimensions—specificity, readability,
and applicability, among others. The “quality” of many publications is perceived
differently by different users, and advice from these sources on what to believe and to do
is often confusing and contradictory.
This chapter has two purposes. The first is to provide a guide to interpreting and
using the management literature, to help one search through the material, categorize and
sort it expediently, and make one’s own decisions about the quality and utility of the
information found. If a consumer is using information gathered and summarized by others in
the form of reports or reviews, these guidelines will help to formulate relevant questions to
ask of those who provide the original information. In that regard, the second purpose is to
provide a critical view of the management literature, to balance the arguments of the
“champions” of each category below.
The basic message of this chapter is quite simple: there is a lot of information out
there competing for attention, and a great deal of it is good material. At the
same time, every source of information has its limitations, and there is no single source
that either is without some drawbacks on one hand, or that can meet the information needs
of every consumer every time, on the other. Some means of making a decision about what
to use for different purposes is needed, and this chapter is intended to provide some help
toward that end.
Locating and Selecting What Is Needed
The ability to keep up with all the literature in even a specialized field is rapidly being
overwhelmed by the volume of material available. This is one of the key implications of
the information explosion. Fortunately, recent advances in electronic search and
information retrieval have at least made searching the literature much more efficient.
While electronic databases are not yet entirely comprehensive, almost all new material can
be located electronically (and a significant amount of older material is being added all the
time). Finding time to read it all is another matter, of course.
Databases are being created by individual libraries around the country, and most
now provide comprehensive lists of the holdings of these libraries. The University of
Delaware, for example, has created DelCat for its book holdings, similar to the electronic
catalogs at the majority of US libraries; Web links to the electronic catalogs of many other
libraries are available. There are extensive CD-ROM databases for journals and
periodicals, and these are of enormous value to the business researcher. The main
difference between the book and periodical databases is that abstracts or descriptions of the
contents of books are not provided, so that the user cannot make a preliminary evaluation
of the contents. This is particularly unfortunate in the case of the specialized books, which
often provide excellent summaries of particular areas of research; this is also changing.
Access to these resources at Delaware is available through campus terminals or an outside
modem at (302) 831-0100 (up to 33000 bps). Modem settings are No parity, 8 data bits,
and 1 stop bit (N-8-1). A Web browser is needed to use the Library databases, and for the
Web version of DelCat. Browsers may be obtained for free by University students from
the Smith Hall computing site or by downloading from the University technology support
The other type of information which is not always located through a database
search is from the proceedings of the professional societies. Unlike most professional
groups in the physical sciences and engineering, not all of the social sciences abstract and
keyword the majority of their papers, and so many do not appear in the databases. There
is a major difference between the types of materials covered in proceedings, depending on
the discipline. In the physical sciences, proceedings carry summaries of the most current
materials being presented at a professional conference; in the management literature, they
either carry developmental or preliminary work or papers not considered good enough to
submit to journals.
In addition, there are rapidly-growing numbers of Web sites which also provide
information directly from the source (for example, the International Standards
Organization and the World Trade Organization, both located in Geneva, Switzerland,
support excellent Web pages). Given the diversity of sites and flexibility of access and
operation, the Web has become a major gateway for information that fits into all of the
Types of Literature
At first glance, the management literature may appear to be much more homogeneous than
it really is. There are actually several different management literatures, and each of these
occupies a specialized niche. Thus, an important problem is how to find the appropriate
material for the consumer’s needs. To a considerable extent, this requires some idea of the
content carried in each of the different categories of material.
I will begin by categorizing the types of literature in the field, and then offer
suggestions on methods to locate, select, and interpret the types. The management
literature can be roughly divided into five groups:
1. “Popular press” and electronic documents
The “popular press” refers to general readership publications, including such well-known
names as The Economist, Business Week, Fortune, Inc., and many others. Also included
in this category are business newspapers such as The Wall Street Journal and Barron’s. A
lesser-known set of publications in this group comes from various government or service-organization
publishers, including publications such as Business America from the US
Department of Commerce, and materials from the Chamber of Commerce. Most are
distributed nationally, and many of them are also distributed electronically through the
World Wide Web.
The familiar trade publications found in nearly every industry in the US and
elsewhere also fall into this category. These are valuable sources of information specific
to an industry, and many carry articles on broader management practices similar to those
found in the national publications. These are increasingly being widely distributed through
electronic media, and have become much more accessible than in the recent past.
The most current “newsworthy” information on matters in the business world is
found in the popular press. If the researcher wants current information on the Pacific Rim
or Europe, it is most likely to be found in this category. The market niche these
publications target is the business or technical reader; they differ in orientation, in that the
mass-market publications aim for general information while the trade journals are highly
. Their values are for practical, hands-on, applicable information that
will be useful to their readers. “Reportage, not analysis” is the simplest characterization
of material carried in these outlets.
2. Practitioner books and compendia
The type of book found here is practitioner-oriented, such as Blown to Bits, Competing for
the Future, The One-Minute Manager, and countless others. They are found in
bookstores, newsstands, and airport terminals all over the world. Two examples of these
types of books are Goldratt’s (1986) The Goal, which has been a commercial success
and has had wide impact on manufacturing management in the US and abroad, and
Peters and Waterman’s (1982) In Search of Excellence, which has been a commercial
success and very influential, but has also been roundly criticized as being a poor
prescription for many firms (and equally poor at identifying “excellent” companies). These
books emphasize readability and applicability, and target the executive and professional
markets; indeed a large part of this category is made up of the “professional” press. “A
book for every occasion” might characterize material in these publications.
3. Practitioner journals
These are journals written primarily by academics and published through universities or
academic outlets, but with content oriented toward practitioners. Included in this group
are The Academy of Management Executive, and others such as the Harvard Business
Review, California Management Review, and Organizational Dynamics.
While the number of these journals is relatively small, they have a well-defined
niche, being written primarily for executives and management professionals. They also
serve as outlets for academics who want to write for executives and colleagues, but in
executives’ terms. The practitioner journals serve as a bridge between the popular press
and academic materials, in that many ideas from the theoretical world can be made
accessible to potential users. What appears in these journals is less likely to be as current
as the popular press, but it is still topical. Articles on new theoretical or other academic
developments are often covered; they provide some insight into the current thinking in
academia, but are not necessarily as concerned with theoretical developments or technical
issues as the academic literature. The values reflected in these journals are also those
common to both the practitioner and the theoretician—both want fresh information and
new ideas, but want some objectivity and closure as well. “Bridges, not new roads” are
characteristic of the material in these journals.
4. Academic books and compendia
Literature in this category comes from specialized academic publishers of books and
collections of materials, principally to provide outlets for research summaries, essays,
theoretical articles, and similar materials. (This may be changing—in recent years
competitive pressures have forced many of these houses to offer more professional titles.)
Publishing houses such as Sage, JAI, Lawrence Erlbaum, and Ashgate are well-established
in this area. While these are oriented toward academics and researchers, these compendia
are generally made up of chapters and essays by well-established scholars in the field, who
provide an overview of the area and point out its strengths, along with its limitations and
shortcomings. Most of these summaries are based on journal publications, but are
not journal publications themselves.
Additionally, of course, there is the college textbook, by no means a homogeneous
type itself, but sharing a common orientation to the undergraduate or graduate student.
Texts are highly variable with respect to the depth of coverage of material and orientation
(theory versus practice), but they usually value clarity and are aimed at the mass market
for students. Texts therefore are valued for readability as well as for scholarly criteria, and
do not assume familiarity with the field. The purposes and values of these publications are
similar to those of the other three categories, respectively, for each type of publication. “A
book for every discipline” is the characteristic of these publications.
5. Academic journals
These are journals which publish academic theory and research. They carry articles
almost exclusively written and read by academics. Several kinds of articles are published,
but most are either theoretical papers or literature reviews, or the results of empirical
research itself. These are highly specialized, and assume that the reader is familiar with
previous research done in specialized fields on which articles are written. Most
professional organizations publish proceedings of their major meetings, and these include
material similar to the professional journals. Both frequently require expertise in
complicated statistical and mathematical procedures, and give much detail on the steps
taken in the research being reported. Because of these properties, their articles have highly
selective audiences, and usually do not report information in a way which lends itself to
In the academic world, where much of one’s career depends on “publish or perish”
criteria, there has been a tremendous expansion of these journals and outlets. The market
niche these journals occupy is almost exclusively the academic world, with very little
readership among practitioners. The purpose of the journals is to enable scholars to
communicate their theories and findings with each other; of equal importance is to enhance
the prestige of the contributing authors and their institutions. Their nominal values are
those of science and the scientific method. “Analysis, not reportage” is the characteristic
of these journals.
One other popular type of research has generated many questions from students over the
years, and this is the survey. While not a category in their own right, surveys are a
popular way to gather empirical data on many subjects, and are widely used to evaluate
many questions, both for academic research and for practical matters in industry and
elsewhere. While they are relatively easy to do and are very flexible, those same
properties make it easy for them to be done poorly. Statistical significance is not so much
the issue with surveys, but two problems afflict many of them: (1) wording of survey
questions, and (2) the nature of the sampling and data collection.
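A small simulation can make the second problem concrete. The sketch below is illustrative only: the population, the 60 percent satisfaction rate, and the response probabilities are all invented assumptions, not data from any study. It compares a random sample in which every member answers with the same sample when dissatisfied members answer less often.

```python
import random


def simulate_survey(population_size=10_000, sample_size=500, seed=0):
    """Return (true_rate, full_response_rate, partial_response_rate).

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Hypothetical population: 1 = satisfied, 0 = dissatisfied (about 60% satisfied).
    population = [1 if rng.random() < 0.60 else 0 for _ in range(population_size)]
    true_rate = sum(population) / population_size

    # A random (probability) sample in which every member answers.
    sample = rng.sample(population, sample_size)
    full_rate = sum(sample) / sample_size

    # Assumed response behaviour: satisfied members answer 90% of the time,
    # dissatisfied members only 30% of the time.
    respondents = [x for x in sample if rng.random() < (0.9 if x == 1 else 0.3)]
    partial_rate = sum(respondents) / len(respondents)

    return true_rate, full_rate, partial_rate


if __name__ == "__main__":
    true_rate, full_rate, partial_rate = simulate_survey()
    print(f"true satisfaction rate:      {true_rate:.3f}")
    print(f"estimate, everyone answers:  {full_rate:.3f}")
    print(f"estimate, uneven response:   {partial_rate:.3f}")
```

Under these assumptions the full-response estimate stays close to the true rate, while the uneven-response estimate overstates satisfaction, even though both start from the same random sample.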
A good survey always follows three guidelines: (1) it uses neutrally-worded
questions; (2) it uses a random (probability) sample; and (3) data are collected from all
sample members. The last part is often the hardest, and the most important—if the
researcher does not get all the members of the sample, a survey cannot be truly considered
The divergence between the categories of literature also reflects the diversity of
topics in the field. There is an abundance of interesting things to write about in business
and management, and each category of literature carries information which is “valid”
within its own sphere. However, readership studies and reports (Buckley, Ferris,
Bernardin, & Harvey, 1998; Byrne, 1990; Gopinath & Hoffman, 1995; Hambrick, 1994;
Thomas & Kilmann, 1994; Lorsch, 1979; Oviatt & Miller, 1989; Price, 1985; Kilmann,
Thomas, Slevin, Nath, & Jerrell, 1994) consistently show a nearly complete divergence
between those who read popular press versus academic journals (categories 1 and 2 vs. 4
and 5), and only limited integration of these materials through the practitioner journals in
category (3). This is even true in education (Miller, 1999), and is evidenced by the July
14, 2000 introduction of a bill by Delaware’s US Representative Michael Castle to require
scientific standards for education research. What is “current” or “useful” or “good” is
therefore by no means similar between categories, and there is every reason to expect that
what is considered valuable by those who read one category will not necessarily even be
known to other consumers.
However, from the perspective of what the different types of materials try to
accomplish, it should be borne in mind that each has its special contribution to make. The
popular press is invaluable for current information in a rapidly changing, time-driven
business environment, and not everything that business does lends itself to scientific
investigation. Those interested in underlying, deeper trends and phenomena must detach
themselves from the everyday din of business to look more carefully into data specifically
gathered to address those issues. Thus, the reality is that no single source or type of
literature can do everything that all consumers might need, and materials from the different
sources are not interchangeable. Each category of the business and management literature
has its strengths and weaknesses, and the next section will summarize these.
Strengths and Weaknesses of the Literature Types
1. The popular press. The major strengths of the popular press and Internet sources are
that they are current, keyed to the specialized interests of their readership, and highly
readable. More than any other category, quality of writing is important to most popular
publications, and the information in them is much more comprehensible, and therefore
useful, to readers. The major publications in the field have highly accurate, credible
reporting; this may not necessarily be true of all Web sources, however, since many of
them have advocacy roles as well as simply providing information.
The problem inherent in the popular press is the same as its strength—currency.
This means that much of the information in this category is incomplete and uneven, in the
sense that some aspects of an issue may be very well developed while other parts are only
fragmentary. Usually this is the nature of “news”—the story is only partially complete.
On a few occasions, this unevenness may reflect conscious or unconscious biases from
authors or editorial staff. Coupled with these limitations is the lack of scientific rigor—
measures and criteria, other than relatively standard financial and physical performance
data, are often not provided or well explained. In some cases, reports of outcomes are not
well substantiated. “Faddishness” is one of the inherent risks of reliance on popular
reporting—concepts of what is ideal, or even workable, may come and go very rapidly,
with little solid evidence of utility, acceptability, or economic payoff.
The World Wide Web is a particularly interesting case. The freedom and ease of
access inherent in the Web allow one to find anything from jumper settings on computer
hard disks to restoration of the American chestnut tree. It also allows hate groups and
terrorists to communicate freely at the other extreme. With its appetite for timely
information, the business community depends on the Web increasingly, not only for
conduct of its B2B transactions, but for more general information as well. Some of this,
from professional news sources and reporting organizations, is monitored and subject to
editorial controls which place a premium on accuracy; other information sources place a
premium on image. For one example, I have yet to read a clear description of the details
of the Procter & Gamble matrix organization on their website; for another, I have students
do country reports on European nations for one of my courses, and despite the fact that
unemployment in the European Union hovered around 11 percent for much of the mid- and
late 1990s, it was nearly impossible to find any member of the EU 15 nations which
reported unemployment higher than that average on their official national websites!
2. Practitioner books and compendia. The diversity of this field makes generalizations
about strengths and weaknesses very difficult. The popular-market books are of such
variable quality and diverse focus that they cannot really be summarized. The best advice
one can give is to read reviews of them, if available, and to maintain a healthy skepticism
about them. Many of these books are written from the point of view of one individual’s
beliefs and experiences, and no matter how heartfelt the author’s convictions that these
personal lessons represent generalizable truths, the validity of this view is hard to establish.
These books can be as subject to faddishness as the popular press, and are often the
primary vehicle for new fads. Caution is the watchword.
Having given that caution, it is only fair to note that many very high-quality books
and compendia have been generated within this category, particularly the professional
press. Anyone who has uttered the words, “core competence” has been influenced by
Competing for the Future (Hamel & Prahalad, 1994); A Future Perfect (Micklethwait &
Wooldridge, 2000) is one of the most lucid and balanced examinations of globalization to
be found anywhere; The Fifth Discipline (Senge, 1990) has arguably done more to make
managers aware of the complexities of organizations than any earlier work in the field of
systems theory. Consumers can get major value from sources like these, but there are
many volumes in this group which are thinly disguised “pitches” for a new fad, many of
which can be damaging to company health; one is inclined to think of reengineering as a
case in point (Hammer & Champy, 1993).
3. Practitioner journals. The strengths of the practitioner journals are that they usually
are less subject to faddishness than the popular press, and that their articles undergo a
review process similar to the academic journals. These journals are also highly readable,
and do not presume expertise (or much interest) in methodology or theory. They deal
with current events, but often with more objectivity than the popular press.
The weaknesses of the practitioner journals are that they sometimes drift in the
direction of philosophy and generality, and that they have limited immediate application
potential for either practitioners or academic researchers. This lack of direct utility is a
function of the “bridge” role that these journals play: they typically try to avoid
involvement in short-term trends and industry issues, but also avoid reporting
many of the theoretical and research-methodological questions of greatest interest to
academics. Partly because of this breadth of interests, they can sometimes fall prey to
faddishness—arguments long on philosophy and short on validity have found outlets here:
the end of nationality-based consumer preferences (Levitt, 1983), organizational
reengineering (Hammer, 1990), and the death of hierarchy through computers (Leavitt &
4. Academic books and compendia. Compendia and annual editions are very similar to
the academic journals in both their strengths and weaknesses. One of the major
advantages of many of these works is that they summarize and evaluate progress in whole
areas of academic research—often some of the best literature reviews of a field are to be
found in these publications. But as was noted above, the fact that these are not “journals”
means that many of these items slip through the databases, so that their availability is more
limited. Also, these materials are usually prepared from a research perspective, and thus
tend to be technical and presume familiarity with the research area. Because their
readership is academic, there is a tendency to present materials in academic terminology,
which may result in the reviews being far less useful to non-expert consumers than they
might otherwise be.
5. Academic journals. The strengths of the academic journals, grounded in the traditions
and value system surrounding academic research, are that they try to avoid the pitfalls of
popular literature through application of relatively more scientific rigor. While academic
research may not reflect current events, and may be very conservative in its willingness to
advocate a position (or even a firm conclusion, in many cases), the objectivity of the
research, the scrutiny of reviewers before an article is published, and the reliance on more
rigorous methods all reduce subjectivity, faddishness, and unclear thinking, especially in
the empirical research literature. The primary safeguard against such errors is the use of
quantitative measures and careful data collection, along with statistical analysis to analyze
and evaluate the data. The assessment of “meaning” is fundamentally shaped by what
empirical data say.
The weaknesses of the academic literature, as with other literatures, are the obverse
of the strengths. The social science model that predominates in management research is
far less mature than the model of the “hard” sciences. Thus, while the traditions and
values of the scientific approach are adhered to as far as the field will allow, much
improvement in the social science model in management research is needed before this
literature can be considered truly “scientific.”
An irony in this connection is that while many individual studies in the social
science tradition are quite good, the literature as a whole falls far short of achieving
scientific credibility. Generally speaking, the more closely the academic literature
approaches the methods and procedures of the “soft” social sciences, such as
communication, decision making, motivation, leadership, team building and group
behaviors, school psychology, and the like, the more unlikely it is that individual studies or
the literature as a whole have scientific validity (Meehl, 1967). This weakness will be
discussed in more detail below.
Suggestions for Interpreting Management Literature
Interpretation is the hardest part of the job in using the management literature. Each of the
different categories above has at least one unique strength not found in other categories,
and at least one major drawback or limitation which is not offset by any of the others. The
short answer to the question of what to accept or believe, then, is that it falls to the user.
At present, none of the categories provides completely reliable or generalizable
Two general points should be kept in mind when evaluating information from any
of these sources. The first concerns time and timeliness. In general, the more rigor
involved in publishing anything, the less timely it becomes. Thus, one is always faced
with a tradeoff between timeliness of information and something approaching more
“scientific” rigor. However, this is not a simple time-vs.-quality tradeoff; each category
contains examples of high and low quality relative to others within its group.
The second point is that there is an underlying economic motivation to all of these
types of publications. In the case of purely commercial publications, the motivation is
simple and direct—sell copies. In other cases, and particularly the academic literature, the
motivation is indirect. Journals and books may be used to build personal and institutional
prestige, which results in pay raises and the ability to attract good faculty and students. In
my first academic job at the University of Southern California, a wise senior colleague
once advised me that there are only two questions that really matter in academic
publication. If the publication is a book, the question is “Will it sell?;” if an article, the
question is “Will it impress?” I have never found reason to discount that wisdom, and I
would recommend that it be borne in mind by any consumer.
By this point, the reader may have come to the conclusion that it is impossible to
find anything that can be believed, and to trust nothing, no matter what the source. That is
far from the case. It is important to bear in mind that in nearly any kind of serious
literature the author or researcher is not intending to deceive. Even when errors are found,
they are usually the result of misinformation, miseducation, often with the best of
intentions, or simply an honest mistake.
Fortunately, the increasing ease of access to information provides opportunities to
cross-check and broadly evaluate material and ideas. In addition, we can formulate some
guidelines to help interpret these literatures, once we are aware of their relative strengths
and weaknesses. In fact, many of the limitations in one category often suggest some of
the precautionary steps we should take in another.
General Suggestions for Interpreting Information
MAGIC. One excellent general set of guidelines for the evaluation of information comes
from Abelson (1995), a source which is really intended for more technical academic
audiences but in my view has wide applicability in all categories. He refers to the
following five criteria which are summarized by the acronym “MAGIC:”
Magnitude (of effect or outcome)—“how large?,” “how often?,” “what percentage
of events?” “what are base rates?” are the kinds of questions we should ask to see
how large an effect is.
Articulation (of argument, including possible opposing positions)—is the research
story told well, and does it consider both sides of an argument in reasoned form?
Generality (breadth of applicability)—is the argument something that has wide
implications, or is it very specific to a time or set of circumstances, and how is that
Interestingness (the argument has the potential to influence someone, perhaps to
even change beliefs)—this is usually a matter of having compelling support for an
Credibility (whether the argument is methodologically or perhaps theoretically
sound)—have alternative arguments been confronted? Are the data “too good to
be true?” Is the conclusion based on a small difference of one out of 100 findings
or only on personal experience?
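One way to read the “one out of 100 findings” question is as a warning about multiple comparisons. As a rough sketch, assuming 100 independent tests each run at a conventional 5 percent significance level (both assumptions are added here for illustration, not taken from Abelson), the chance that at least one result looks “significant” by luck alone can be computed directly:

```python
def chance_of_spurious_finding(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_tests independent tests is
    'significant' purely by chance, assuming no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_spurious_findings(num_tests: int, alpha: float = 0.05) -> float:
    """Average number of chance 'findings' under the same assumptions."""
    return num_tests * alpha


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(chance_of_spurious_finding(n), 3), expected_spurious_findings(n))
```

Under these assumptions, a lone significant result among 100 comparisons is about what chance alone would produce, which is why it is worth asking how many comparisons stand behind a reported finding.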
Although Abelson addresses an academic audience whose interests are in using
statistical information most effectively, his ideas strike me as having much applicability to
many questions we face in business, where cause and effect can never be that clear and
every story might have another side we have not yet considered. Without too much effort,
it is easy to apply MAGIC criteria to many kinds of questions.
Measurement. In any of the categories of literature (not just empirical studies and
surveys), “results,” “findings,” “payoffs,” and similar things are often reported. Thus,
another very useful question to ask is “How was that measured?” Often the details are not
supplied, or the “result” is not supported. Whether someone claims that there was a huge
benefit from a new information system or a huge unrecovered cost, it is always a good
idea to ask how that result was obtained. Measuring most things, even the seemingly
obvious ones like costs and returns, is often harder than it first seems. For that reason, it is
worth keeping in mind that most surveys are not cheap or simple, and are not undertaken
out of disinterest in the outcome.
This is an area where the academic research literature has a lot of value. Whether
we think a particular theory or idea is worth studying or not, we usually can tell how data
were obtained. It is common to publish the actual measures used, no matter what the
variables being studied. This is very unusual in the other categories of literature. Once
again, however, be wary of the claims of what those measures “show”—this is the
Advocacy and Generalizability. Any item in the literature must be evaluated carefully for
its generalizability. The primary question is the extent to which a finding can be
considered to be representative of a large group—does the result apply in general? In the
popular press, many single-case or single-company studies are reported, and the extent to
which the conclusions from one study might apply to a broader population of
organizations is always an open question.
A common characteristic of materials which are argued to be generalizable is the
advocacy of some position, whether intended or not. A problem or an opportunity may be
widespread; convincing people that it is also huge is much more likely to evoke action
than something widespread but small. Since the time and energy to do a study or an
article showing some effect is usually not trivial, there is a vested interest in having some
impact when the story comes out. This can easily lead to the kind of sensationalism that
Stossel (1997) calls “junk science.”
Multiple Sources. One of the benefits of having several different categories of material to
draw from is that they allow consumers to cross-check arguments. This may not always
be possible because some topics tend to be specific to one or the other of the categories,
but over time a topic usually generates multiple papers and studies, and even within a
category some “triangulation” is possible. With the increasing ease of literature searches
through databases and the Web, finding multiple items is less of a problem than before.
For any kind of material, it is always useful to avoid extremes, either in sources we
select, or in criteria we use to accept material of a specific type. Just because some
advocacy researchers are prone to overstatement in their zeal, it is not necessarily true that
advocacy of a particular position is unwarranted. Neither is it true that “objective”
researchers have no stake in the outcomes of their research. We might value experience
more than experiment, and therefore tend to disregard academic research in favor of
“stories from the trenches.” But it is hard to generalize from one experience to another,
and so we might do well to take a more dispassionate look at a subject if we can, and
academic research often meets that need very well.
Suggestions for Interpreting Academic Research Information
The academic literature is harder to evaluate because of its specialized nature and the
persistence of many of the limitations discussed earlier. In many cases, to be quite frank,
there is little reason for the research consumer to go into this material directly unless
needing some very specialized information in the field being investigated, and frequently
with the help of a knowledgeable assistant. Given the specialized nature of this body of
literature, it is necessary to go into some additional details on the problems inherent in it,
particularly the empirical research in the field.
Limitations of the Academic Literature. In the academic world, the professional
journals and materials are valued above all others. They are considered to provide the
most rigorous analysis of important questions and serve as communication media among
professionals. However, the primary weaknesses to guard against in the academic
literature are those inherent to the “soft” social sciences, which are weak sciences at best
(Meehl, 1986; Meehl, 1990) and in my view are not yet fully “science” at all. Those
wishing to go into this question in more detail should read Chapter 4.
The majority of my comments in this section are related to the empirical research
literature in management. A bit of background or review may be helpful. Empirical
research is usually conducted by forming a theoretical model, stated in the form of one or
more hypotheses; gathering data to measure the variables in the model; and then
examining the results. The paper derived from this process is then submitted to a
professional journal. It is almost certain that it will be subjected to blind peer review;
unlike the physical sciences, it is likely to be revised one or more times on the basis of
reviewer comments before finally being accepted for publication. However, the large
majority of articles submitted to the most prestigious of these journals are rejected.
The questions we ask in empirical research studies can almost all be reduced to
one of two forms—whether one group of items is different from another, or whether two
groups are associated with each other. All that we really measure in research are these
two things—differences or associations. Differences are usually measured as the
difference between averages or means, and association is usually measured as a
correlation. We refer to the difference or the association as the effect, and to its
magnitude as the effect size (technically, this is a “raw” effect, not “standardized,” but this
point need not concern us here).
For example, we want to know whether classroom training for workers actually
improves job performance. We may sample people given classroom training and people
not given classroom training, and compare their job performance to see whether there is a
difference. We could also see whether people given more hours of training perform better
than those with fewer hours, to see whether the amount of classroom training is associated
with job performance. The research question is to see whether differences in training are
related to differences in performance, i.e., a performance effect.
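The two kinds of effect described above can be sketched in a few lines of Python. This is only an illustration: the ratings and training hours are made-up numbers, and the helper names mean_difference and pearson_r are hypothetical, not taken from any study.

```python
from statistics import mean

# Hypothetical performance ratings, invented purely for illustration.
trained = [3.9, 4.1, 3.7, 4.4, 4.0, 3.8]
untrained = [3.5, 3.6, 3.9, 3.2, 3.7, 3.4]

def mean_difference(group_a, group_b):
    """Raw effect for a 'difference' question: the gap between group means."""
    return mean(group_a) - mean(group_b)

def pearson_r(xs, ys):
    """Raw effect for an 'association' question: the Pearson correlation."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical hours of classroom training and ratings for six workers.
hours = [0, 4, 8, 12, 16, 20]
rating = [3.2, 3.5, 3.4, 3.9, 4.0, 4.3]

diff = mean_difference(trained, untrained)
r = pearson_r(hours, rating)
print(f"difference in mean performance: {diff:.2f}")
print(f"correlation of hours with performance: {r:.2f}")
```

Either number is an effect in the sense used here; neither says anything yet about whether the effect is large enough to matter.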
Since we call this kind of question an hypothesis, the overall process of collecting
data and analyzing it this way is hypothesis testing. Also, we almost always base our
studies on samples of people, rather than a whole population. We state our hypothesis in a
form known as a null hypothesis, which proposes that the groups to be evaluated are
assumed to be drawn from the same population, even if we really do not think they are.
For example, even though we sample people given classroom training and people not
given classroom training, the null hypothesis says that we assume there will be no
difference in performance between them, as if they came from the same untrained
population—this is why we call the hypothesis “null.” The objective of testing the null
hypothesis is to see whether we can reject it—the null hypothesis is a “straw man” we try
to knock down. If the difference is big enough, or the association strong enough, we reject
the null hypothesis, and conclude that we cannot say that classroom training did not make
a difference in job performance (but we still cannot say with certainty that it did). This
may seem a strange way to do empirical research, but this method lets us use some very
powerful statistical tools.
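One way to see the "straw man" logic at work is a permutation test, sketched below in Python. This is a swapped-in technique chosen because it is easy to read, not the procedure used in any particular study; the data and the function name permutation_p_value are hypothetical. The idea is that if the null hypothesis were true, the "trained" and "untrained" labels would be interchangeable, so shuffling them shows how often chance alone produces a gap as large as the one observed.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Share of random relabelings whose mean gap is at least the observed gap."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend group labels are arbitrary
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# Hypothetical performance ratings, invented purely for illustration.
trained = [3.9, 4.1, 3.7, 4.4, 4.0, 3.8]
untrained = [3.5, 3.6, 3.9, 3.2, 3.7, 3.4]

p = permutation_p_value(trained, untrained)
print(f"share of shuffles with a gap this large: {p:.3f}")
```

A small share is grounds for rejecting the null hypothesis, but it says nothing about how large or how important the difference is.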
How do we know if we found anything? To answer that question, we need to look
at the effect size, and that is straightforward: if there is a difference between things, or an
association between things, how big is it? The problem is deciding when an effect is big
enough to be meaningful. There are no short, good answers to that question—in the end, it
falls on the researcher to conclude that the effect size is meaningful, and on the consumer
to decide if the researcher has made a convincing case.
So far, this all sounds very good, and these general methods can be applied to
many different kinds of data. This analytical technique can be a very powerful tool if used
correctly. Unfortunately, there are several problems in the soft social science model which
greatly limit the utility of much academic business research. Although I am treating them
separately, they are highly interdependent. Three problems in particular are endemic to
the soft social science research tradition: (1) the lack of research replication; (2) the
inability to cumulate or generalize the results of research; and (3) the incorrect
interpretation of statistical significance.
These problems are discussed in more detail in Chapter 4, but the practices of the
“soft” social science model are such that virtually no research is ever replicated, so that
errors may go undetected, challenges to published interpretations of findings are unlikely
to be reported, and failures to replicate a published study go unreported. For a number of
reasons, there is a premium placed on novelty and originality in this research tradition, so
that nearly every study published is dissimilar in some respects to previous work (even on
the same subject), so cumulation of research toward reasonably firm conclusions is not
Finally, statistical significance has effectively become a substitute for effect sizes,
meaning that many conclusions based on “support” from statistical significance are highly
questionable, and any finding based on a large sample will be able to claim “support” for a
theory based on “rejection” of the null hypothesis. This is simply an error, and is a very
serious limitation to the credibility of research based on soft social science methods. For
present purposes, the only thing we need to be aware of is that the p level tells us the
likelihood that we got the data we did as a result of sampling error, or P(D|H), and nothing
The consumer should always be mindful of an obvious but crucial difference
between the social and physical sciences—the physical world can be counted on to behave
in ways determined by physical forces, and the world of human behavior cannot. In the
business world, this is especially true since organizations strive to differentiate themselves
from others, and to find a niche where they can succeed. This precaution becomes very
important when reading popular-market books “based on the research.” But because of
the limitations of academic research based on statistical significance, just about any
position on any argument can be “supported” in that literature as well. Most academic
researchers make the statistical misinterpretations discussed above, and the “best” journals
are filled with them.
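The claim that a large enough sample lets almost any finding "reject" the null hypothesis can be checked with a little arithmetic. The sketch below is an illustration under stated assumptions: a simple two-group z-test with a known, equal standard deviation, and a hypothetical raw difference of 0.05 units with a standard deviation of 1.0. The function name two_sided_p is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means (z-test)."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The effect never changes; only the sample size does.
for n in (50, 500, 5000, 50000):
    p = two_sided_p(mean_diff=0.05, sd=1.0, n_per_group=n)
    print(f"n per group = {n:>6}: p = {p:.4f}")
```

The same trivial difference moves from clearly "non-significant" to overwhelmingly "significant" as the sample grows, which is why the p level cannot stand in for the effect size.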
Effect Size Is What Matters. How should we evaluate an empirical study if we need to?
The answer to that problem is straightforward: the user should consider the effect size, and
can use that to further evaluate the power of the test. “Power” in the broad sense (i.e., not
strictly just 1 - β) depends on three things: (1) the effect size, (2) the level of statistical
significance, as a check on P(D|H), and (3) sample size.
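How these three ingredients interact can be sketched for the narrow, textbook sense of power (1 - β, the chance of rejecting the null hypothesis when a real effect exists). The sketch assumes a two-group z-test and a standardized effect size d (the mean difference divided by the standard deviation); both are simplifying assumptions for illustration, and the function name power_two_group_z is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_group_z(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)          # cutoff set by the significance level
    shift = effect_size_d * sqrt(n_per_group / 2)  # how far the true effect moves the test statistic
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)

# Vary effect size and sample size at a fixed significance level of .05.
for d in (0.1, 0.5):
    for n in (20, 64, 500):
        print(f"d = {d}, n per group = {n:>3}: power = {power_two_group_z(d, n):.2f}")
```

Small effects need very large samples before a test is likely to detect them, while large samples will flag even effects too small to matter.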
For example, if we measured two groups on a five-point scale of performance on
some task, is it important that the average for Group A is 3.87, and the average for Group
B is 3.99? On a five-point scale, a difference of 0.12 units (the effect) is so small we
would probably conclude it means nothing; on the other hand, means of 3.13 and 3.99 are
far enough apart to indicate that these groups probably differ in some material way. This
is exactly what “effect” means.
Effect sizes in tests of association are usually expressed as correlation coefficients
(r for simple correlation and R for multiple correlation). Correlations are best interpreted
conservatively as squared values: that interpretation is literally “the percentage of variance
in the dependent variable explained by the independent variable(s).” An r of .21 explains a
little more than 4 percent of the variance in the dependent variable (.0441, to be exact)—
the other 96 percent is unexplained.
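The squaring is simple enough to do by hand, but a short table makes the pattern plain. The sketch below reproduces the r = .21 case from the text; the other r values are arbitrary illustrations, and the function name variance_explained is hypothetical.

```python
def variance_explained(r):
    """Proportion of variance in the dependent variable accounted for by r."""
    return r ** 2

for r in (0.10, 0.21, 0.30, 0.50):
    explained = variance_explained(r)
    print(f"r = {r:.2f}: explained = {explained:.4f}, unexplained = {1 - explained:.4f}")
```

Even a correlation of .50, which would be considered strong in much of this literature, leaves three quarters of the variance unexplained.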
The interpretation of findings through statistical power makes very good sense: we
consider sample size (and method, by implication) to determine if we have enough of the
right cases to measure what we want; we check statistical significance to estimate the risk
of sampling error; and then we see how big the effect is. Based primarily on the latter, we
make our call as to what the results tell us. (Although it is never seen in the management
or social sciences, I agree with Cohen (1990, 1994) that confidence intervals should be
reported as well.) When reading a paper, look for direct reporting of effects—usually
mean differences or correlation coefficients (there are also specialized and standardized
measures of effect size, but these are rarely used). Since results reported in many studies
are not measures of effect size, be prepared to simply disregard them.
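A confidence interval around an effect can be sketched as follows, using the two pairs of group means from the earlier five-point-scale example. The standard deviation (0.8) and group size (40) are hypothetical, since the text gives neither, and the interval uses a normal approximation rather than the t distribution. The function name mean_diff_ci is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_diff_ci(mean_a, mean_b, sd_a, sd_b, n_a, n_b, level=0.95):
    """Normal-approximation confidence interval for mean_b - mean_a."""
    diff = mean_b - mean_a
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * standard_error, diff + z * standard_error

# Group means from the text; SD and n are assumed for illustration.
small_gap = mean_diff_ci(3.87, 3.99, 0.8, 0.8, 40, 40)
large_gap = mean_diff_ci(3.13, 3.99, 0.8, 0.8, 40, 40)
print(f"3.87 vs 3.99: {small_gap[0]:.2f} to {small_gap[1]:.2f}")
print(f"3.13 vs 3.99: {large_gap[0]:.2f} to {large_gap[1]:.2f}")
```

Under these assumptions the interval for the small gap spans zero, while the interval for the large gap stays well above it, which conveys both the size of the effect and the uncertainty around it in one statement.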
There has been a recent important improvement in the reporting of results. Since
1995, the American Psychological Association has required publication of R² and adjusted
R². Shortly thereafter, the Academy of Management followed suit, and an article by
Waller, Huber, and Glick (1995) actually discussed the importance of Type II error (an
important source of error which is completely ignored when the criterion of merit is
statistical significance); there was even a short discussion of the issue of statistical power
in that article. In addition, there are some journals where publication of data to support
meta-analysis has been encouraged, to allow better accumulation of studies (see Hunter,
Schmidt, and Jackson (1982) for a clear and convincing explanation of meta-analysis, or
Schmidt (1992) for a briefer discussion; like Cohen (1990, 1994), I am a big fan of meta-analysis).
The American Psychological Association Task Force on Statistical Inference
(Wilkinson & the Task Force on Statistical Inference, 1999), about which we will hear
more in Chapter 4, also recommended some positive changes in reporting and discussing
research results. Unfortunately, the Task Force failed to fully confront the limitations
of significance testing, and that problem persists and, in fact, is spreading to and
contaminating other areas of research outside the soft social sciences.
Interpreting Survey Results. I noted earlier that if a researcher does not get all the
members of a survey sample, the results cannot be truly considered as representative. This
is where many surveys get into trouble, and can become misleading. Allowing
respondents to select themselves (actively or passively—i.e., allowing some participants to
not participate, or simply not chasing down the final hard-to-reach subjects) creates what
Norman Bradburn of the National Opinion Research Center calls “SLOPS”: Self-seLected
Opinion PollS (Tanur, 1994). However trendy or otherwise interesting, these
are not representative surveys. An example of a SLOPS survey is the famous “fax poll”
of Ross Perot during the 1992 Presidential campaign. He reported that the overwhelming
response of the “survey” (“50 percent of the American people”) was in favor of an
immediate balanced budget. While half of the people who faxed may have stated that
opinion, this was by no means a representative survey—only those who were already
watching, a self-selected audience, knew there was a survey being done, and of those the
only respondents were those with a fax machine (which the majority of the population has
The problem of wording of survey questions often produces results which are very
good at attracting attention, but very poor at providing information. Daniel Koshland,
editor of Science, refers to these as “Oy Veys,” rather than true surveys (Tanur, 1994).
Either of these can yield results which are fun, exciting, sensational, and often the raw
material for a book or an appearance on a talk show, but they are not science. An example
of this (and a SLOPS as well) was the famous Shere Hite survey of women’s relationships
conducted through Redbook magazine. She asked readers to complete an eight-page
longhand form describing women’s problems in their relationships with men, and then
published it as representative of the whole population. She made claims such as “98
percent of the women in the US feel men treat them in demeaning ways in their
relationships.” The criticism of her incorrect methods became so intense that she
eventually left the US, and now lives in Switzerland (doing the same thing, by the way).
The recent popular book Stiffed (Faludi, 1999) makes an equally serious error of
non-representative sampling of American males, many of whom would be considered members
of fringe groups. All of these are fun to read, but none of them are “science,” any more than the
conclusions one might reach by surveying customers as they enter a Wal-Mart.
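The distortion that self-selection produces can be worked out with simple arithmetic. In the sketch below every number is hypothetical: 30 percent of the population is assumed to hold some opinion, and those who hold it are assumed to be ten times as likely to respond as those who do not. The function name share_among_respondents is also hypothetical.

```python
def share_among_respondents(true_share, response_rate_holders, response_rate_others):
    """Expected share of respondents holding an opinion when response rates differ."""
    responding_holders = true_share * response_rate_holders
    responding_others = (1 - true_share) * response_rate_others
    return responding_holders / (responding_holders + responding_others)

true_share = 0.30  # assumed share of the whole population holding the opinion
observed = share_among_respondents(true_share, 0.20, 0.02)
print(f"true share in population: {true_share:.0%}")
print(f"share among self-selected respondents: {observed:.0%}")
```

When the motivated respond and the indifferent do not, a minority view in the population can look like an overwhelming majority in the "survey."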
Learning from Surveys—Four General Questions. The discussion of surveys
earlier also suggests four very good questions to keep in mind in evaluating nearly any
item from the management literature: (1) Who sponsored the survey (or study, or
article)?; (2) How was the sample determined (or source of any kind)?; (3) How were the
data collected?; (4) How were the questions worded? For cases involving specific
questionnaires, we might also ask, (5) How were options for responses arranged? The
first four questions are useful for just about anything in the literature—books and articles
from descriptive and non-research literature, as well as surveys and academic research.
“Keep It Simple, Stupid.” Another guideline is to place relatively more weight on
conservative statistical procedures, simple designs, and simple methods when making
judgements about the reliability of research claims. It is also completely appropriate to be
skeptical of studies relying on multivariate methods. This is contrary to much of the
conventional wisdom that leads to a bountiful career as an academic researcher, but
multivariate methods often produce statistics which are difficult to interpret clearly, such
as interaction effects. Many multivariate procedures rely on model assumptions which are
frequently unmet, and multivariate methods always capitalize on any form of random
association between measures, no matter what the source. Abelson (1995) points out that
many of these rely on omnibus tests and are about as precise as “playing a guitar while
In contrast, tests of differences between group means and tests of association
through correlation or contingency tests are robust, well understood, and have meaningful
interpretations. In a literature review, findings based on these simpler procedures should
always be given more weight than results from complex or multivariate procedures.
Simple procedures are always the most powerful.
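The tendency to capitalize on random association can be illustrated with a small simulation. It is a deliberately simplified sketch, not a multivariate procedure: an outcome and fifty candidate "predictors" are generated as pure noise, and the strongest correlation found grows as more predictors are screened. The sample size of 30, the seed, and the helper names are all hypothetical choices.

```python
import random
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)  # fixed seed so the run is repeatable
n_cases = 30

# An outcome and 50 candidate predictors, all unrelated noise by construction.
outcome = [rng.gauss(0, 1) for _ in range(n_cases)]
predictors = [[rng.gauss(0, 1) for _ in range(n_cases)] for _ in range(50)]
abs_r = [abs(pearson_r(p, outcome)) for p in predictors]

def strongest_of_first(k):
    """Largest absolute correlation among the first k noise predictors."""
    return max(abs_r[:k])

for k in (1, 5, 20, 50):
    print(f"best of {k:>2} unrelated predictors: |r| = {strongest_of_first(k):.2f}")
```

None of these variables has any real relationship to the outcome, yet screening enough of them will reliably turn up one that looks meaningful.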
If these guidelines are followed in evaluating the academic research, using it will
be much easier for the simple reason that the majority of it will be rejected as inconclusive.
And that is completely appropriate—no user wants to base important decisions on
ambiguous information, and if that is all there is, then we want to develop our own.
On the other hand, while the null hypothesis is something we continue to use in the
academic literature despite its flaws, it suggests a good approach to interpreting much of
what we read from any source: assume it is no different from anything else you have read
before. If you are convinced it had value at the end of the item, then it probably did. But
a willingness to ask hard questions, and a healthy skepticism, is a good perspective for any
consumer or any researcher.
Much of this chapter has focused on the limitations of the different categories of
management literature rather than the relative advantages. In part, I have chosen this
focus because each of the sources tends to be championed by those who find it most helpful;
there is, in other words, an inherent form of “advocacy” which characterizes all the
different types of literature. Those who favor the richness and currency of company case
histories prefer the popular and “bridge-journal” presses; scholars and academics favor the
academic research journals; and so on.
But while I have been critical of all of these sources, each also has its advantages,
and no single source can provide all the information consumers might want for different
purposes. The point is that we need to look at both the roses and the thorns to have a
balanced view of what the literature has to offer. We are fortunate to have rich
information resources available to us, and we want to use that resource as intelligently and
effectively as we can. I hope this guide will provide some assistance in that direction.
Beware the “Canals on Mars”
There is a famous story about Percival Lowell’s study of the canals on Mars, based on
his years of observations from the Lowell Observatory. He saw them because he “knew”
they were there from the 1877 work of Giovanni Schiaparelli, and carefully documented
and mapped how they changed over the seasons. He also found almost four times as
many canali as Schiaparelli! Having our present knowledge of Mars’ topography from
unmanned landings and close satellite photography, we now know there are no such
features at all. True believers in a particular management approach (or anything else) will
always “find” support for it. This is true for any kind of literature at all, and if even the
most careful of hard scientists can fall prey to it, the popular press and social sciences are
even more prone. Beware of the canals on Mars!