

Using Data Mining to Help Design Sustainable Products

Article in Computer · August 2011


DOI: 10.1109/MC.2011.257 · Source: DBLP


All content following this page was uploaded by Chandrakant D Patel on 26 August 2014.



DISCOVERY ANALYTICS

Using Data Mining to Help Design Sustainable Products

Manish Marwah, Amip Shah, Cullen Bash, and Chandrakant Patel, HP Labs
Naren Ramakrishnan, Virginia Tech

Data mining techniques make it possible to automate product life-cycle assessment with reasonable accuracy, even in cases of low-quality inventory data.

0018-9162/11/$26.00 © 2011 IEEE Published by the IEEE Computer Society AUGUST 2011 103

When making purchases, customers increasingly consider products’ environmental impact as well as traditional criteria such as cost and features. In 2008, market research firm Gartner estimated that about 75 percent of enterprises would include some type of life-cycle environmental assessment in their purchasing decisions about future IT systems (D. Plummer et al., “Gartner’s Top Predictions for IT Organizations and Users, 2008 and Beyond,” report G00154035, 8 Jan. 2008). The recent surge in ecolabels and green stickers also indicates growing public sentiment that consumer offerings should meet minimum sustainability requirements or limit harm to the environment.

A product’s environmental footprint is typically estimated through life-cycle assessment (LCA), which takes a comprehensive view of multiple environmental impacts such as greenhouse gas emissions, toxicity, and carcinogenicity (A. Shah et al., “Assessing ICT’s Environmental Impact,” Computer, July 2009, pp. 91-93). Researchers could use LCA to answer questions such as: How do the Apple iPad, Samsung Galaxy Tab, and HP TouchPad compare in terms of their carbon footprint? Is an e-reader more environmentally friendly than a paper book? (D. Goleman and G. Norris, “How Green Is My iPad?” The New York Times, 4 Apr. 2010).

However, LCA can be a manual and laborious process. Accurately estimating the environmental impact factors associated with a server, for example, may involve creating a detailed inventory of all its components, usually down to parts such as integrated circuits (ICs), resistors, fans, heat sinks, and even screws, paint, and labels; estimating their mass or volume; and, finally, mapping each component to representative entries in an environmental database.

We treat LCA as a data mining problem and propose an automated LCA (auto-LCA) approach that lets a user simply input an existing product inventory and obtain an approximate environmental footprint of all its components as output.

AUTO-LCA

Given the large-scale manual processing of semistructured data associated with performing LCA, we redefine LCA as a data mining problem and integrate data mining solutions from different contexts to obtain an auto-LCA methodology.

We consider a product inventory, such as a bill of materials (BOM), to be a compositional containment hierarchy and represent it as a tree. For example, a desktop computer contains a printed circuit board (PCB), which contains ICs, capacitors, and resistors; these components in turn contain silicon and other metals.

An auto-LCA system requires two databases: a product database, which includes each product’s BOM as well as information such as part number and description; and an environmental database of generic information about the environmental impact of various components, for example, the Ecoinvent database (www.[Link]).

As Figure 1 shows, the path from BOM to environmental footprint entails four main steps: matrix completion, clustering, node translation, and tree reconstruction.

Figure 1. Automated life-cycle analysis (auto-LCA) methodology. Obtaining a product’s approximate environmental footprint from its bill of materials (BOM) entails four main steps: matrix completion, clustering, node translation, and tree reconstruction.

[Figure 1 diagram labels. Inside the automated LCA system: an environmental impact database (for example, Ecoinvent) feeds step 1, matrix completion, which yields standardized environmental nodes. A product tree (for example, a BOM) from the product database feeds step 2, clustering, which yields BOM node clusters. Step 3, node translation, maps the BOM clusters to standardized environmental nodes. Step 4, tree reconstruction, estimates the quantity of each component and yields the approximate environmental footprint of each component. The product database and the resulting footprint sit on the customer/user system(s) side.]

Matrix completion

We first check the environmental database for incomplete or invalid information. For example, many such databases only have a few impact factors, such as energy and ecotoxicity, listed for certain nodes, and lack relevant information such as carbon footprint data or quantified impacts on human health. The environmental database can be viewed as a matrix with components as rows and impact factors as columns. Estimating missing values thus constitutes a matrix completion task. This problem is similar to that encountered in recommender systems, in which the goal is to estimate missing ratings for unseen items such as movies and books.

Clustering

Next, we perform cluster analysis on the potentially hundreds of items listed in the BOM with the objective of grouping together similar nodes, that is, those likely to have comparable environmental impact. The clustering algorithm requires a distance metric computed from the node attributes to posit groupings. Simply using part numbers isn’t sufficient, as many BOM components could be quite similar from the standpoint of environmental impact but very different in terms of how they’re identified in the product tree. For example, two identical stainless steel screws that reside in different parts of the system might have distinct part numbers.

To compute the distance between node descriptions, we use approximate string-matching techniques such as longest common subsequence (LCS), longest common prefix (LCP), Levenshtein distance (LD), or a combination of these. Once we obtain a distance metric, we employ clustering algorithms such as k-medoids to group similar BOM nodes together. The resulting clusters reduce the number of parts to be evaluated from up to several thousand to a smaller, more manageable number, typically at least an order of magnitude lower.

Node translation

We then assign each of these clusters a representative node similar to its medoid from the environmental database. In this way, we “translate” BOMs associated with distinct products that come from various suppliers and have different naming schemes into a standard terminology derived from the environmental database, thereby yielding insight into the environmental impact related to each cluster. Ideally, such translation would be learned based on some training data or by comparing BOM and environmental node descriptions, but we currently perform this translation manually. It’s worth noting that clustering is what makes such manual translation feasible, as it reduces the number of required translations by more than an order of magnitude.

Tree reconstruction

A challenge for translation is that the units specified in the BOM nodes and the environmental database nodes can differ. For example, most product BOMs specify the number of repeating instances for a particular part, while the environmental nodes could be specified by mass (kg). To rectify this, we use the property that for any environmental impact, the sum of the child node impact values approximately equals that of the parent (root), forming a linear system of simultaneous equations with the coefficients of the child nodes being the unknowns. To fully reconstruct the BOM tree comprising environmental nodes, we must estimate these coefficients.
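As a sketch of the clustering step described earlier, the following groups a handful of BOM descriptions taken from Figure 2. It is only an illustration under stated assumptions: the article names Levenshtein distance and k-medoids but does not give an implementation, so the length normalization, the farthest-first initialization, the simple alternating update, and the helper names (levenshtein, distance, k_medoids) are all hypothetical choices.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def distance(a: str, b: str) -> float:
    """Assumed metric: edit distance scaled by the longer description."""
    return levenshtein(a, b) / (max(len(a), len(b)) or 1)


def k_medoids(items, k, max_iter=100):
    """Alternating k-medoids; assumes k <= number of distinct items."""
    n = len(items)
    d = [[distance(items[i], items[j]) for j in range(n)] for i in range(n)]

    # Deterministic farthest-first initialization, starting from item 0.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n),
                           key=lambda i: min(d[i][m] for m in medoids)))

    for _ in range(max_iter):
        # Assignment step: attach every item to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            clusters[min(medoids, key=lambda m: d[i][m])].append(i)
        # Update step: the new medoid minimizes total in-cluster distance.
        new_medoids = [min(members, key=lambda c: sum(d[c][j] for j in members))
                       for members in clusters.values()]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids

    return {items[m]: [items[i] for i in members]
            for m, members in clusters.items()}


# Node descriptions copied from Figure 2 (three resistors, three capacitors).
descriptions = [
    "RES-CHIP-68-1%-1/10W",
    "RES-CHIP-200-1%-1/10W",
    "RES-CHIP-33-1%-1/10W",
    "CAP-CHIP-22PF-50V",
    "CAP-CHIP-100PF-50V",
    "CAP-CHIP-270PF-50V",
]
clusters = k_medoids(descriptions, k=2)
for medoid, members in clusters.items():
    print(medoid, "->", members)
```

Each key of the returned dictionary is a medoid description, which is the single item a practitioner would then translate by hand to an environmental database node on behalf of the whole cluster.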

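The linear system behind tree reconstruction can be illustrated with a toy non-negative least squares problem. This sketch is not the Lawson-Hanson algorithm cited in the article: to stay dependency-free it swaps in plain projected gradient descent, which solves the same constrained problem on small, well-conditioned inputs. All numbers are synthetic, and the names (nnls_projected_gradient, per_unit_impacts, parent_impacts) are hypothetical.

```python
def nnls_projected_gradient(A, b, iterations=2000):
    """Minimize 0.5 * ||A x - b||^2 subject to x >= 0.

    A[i][j] is impact factor i per unit of child node j, and b[i] is the
    parent's value for impact factor i. Assumes A is not all zeros.
    """
    m, n = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe (if conservative) step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iterations):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i]
                    for i in range(m)]
        gradient = [sum(A[i][j] * residual[i] for i in range(m))
                    for j in range(n)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * gradient[j]) for j in range(n)]
    return x


# Synthetic example: 4 impact factors (rows) for 3 child nodes (columns).
per_unit_impacts = [
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 1.0],
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
]
# Parent impacts built from the made-up quantities [2.0, 1.0, 3.0].
parent_impacts = [9.0, 8.0, 7.0, 5.0]

coefficients = nnls_projected_gradient(per_unit_impacts, parent_impacts)

# Each child's contribution to the first impact factor of the parent.
contributions = [per_unit_impacts[0][j] * coefficients[j]
                 for j in range(len(coefficients))]
print(coefficients, contributions)
```

With real data the parent and child impacts agree only approximately, so the fit leaves a residual rather than recovering exact quantities; a library NNLS routine would be the more robust choice for systems with hundreds of impact factors.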
We perform a least squares regression fit to best estimate the coefficients. Because the coefficients must be non-negative, the goal is to obtain a non-negative least squares (NNLS) fit (C.L. Lawson and R.J. Hanson, Solving Least Squares Problems, Society for Industrial and Applied Mathematics, 1995). Knowing a single node’s weight (the root or one of the child nodes) lets us compute the environmental contribution of each child node to the total (parent) impact.

CASE STUDY: SERVER PCB

With these building blocks in place, it’s possible to estimate the environmental footprint of an arbitrary product tree or BOM. We illustrate the approach by analyzing a real PCB from an enterprise server. This PCB BOM contains about 560 components, including a mix of resistors, capacitors, ASICs, and logic devices.

We cluster the BOM nodes using k-medoids to identify 22 unique clusters. Figure 2 shows two examples. It’s relatively easy to translate these clusters (actually their medoids) into a list of nodes from the environmental database. For the resulting environmental tree, we can utilize the impact factors available from the environmental database and successfully solve for the coefficients of the child nodes. Figure 2 includes the resulting coefficients of select nodes in the environmental tree, which enables readily computing each child node’s environmental impact. The median error between the sum of the child node impact factors and the parent, for about 200 impact factors, is only 12.8 percent, a highly satisfactory result.

Figure 2. Server printed circuit board (PCB) clusters, node translations, and coefficients.

Server PCB (additional clusters not shown for brevity)

"Capacitor" cluster
Translated environmental node(s):
    7012 (Cap, electrolytic)
    7009 (Cap, thin-film)
    7010 (Cap, SMD)
Reconstructed node coefficients:
    7012 → 0.018
    7009 → 0.012
    7010 → 0.005
Clustered BOM nodes:
    6010A0017601 CAP-CHIP-0.1MFD-16V-K-X7R-0603
    6010A0020001 CAPACITOR-TANT,10uF,16V
    6010A0036104 CAP-CHIP-15PF-50V-J-NPO-0402
    6010A0036107 CAP-CHIP-22PF-50V
    6010A003610J CAP-CHIP-100PF-50V
    6010A003620N CAP-CHIP-270PF-50V
    CAPACITOR-AL-SP,150uF,6.3V,M
    (additional nodes not listed for brevity)

"Resistor" cluster
Translated environmental node(s):
    7069 (Resistor, metallic film)
Reconstructed node coefficients:
    7069 → 0.017
Clustered BOM nodes:
    60130B68R09T RES-CHIP-68-1%-1/10W
    6013A0038101 RES-CHIP-200-1%-1/10W
    6013A0059101 RES-CHIP-0-5%-1/10W
    6013A0078001 RES-CHIP-91-1%-1/10W
    6013A0085801 RES-CHIP-4.7K-1%-1/10W
    6013A0088009 RES-CHIP-1K-1%-1
    6013A0091701 RES-CHIP-33-1%-1/10W
    6013B0079801 RESISTOR-CHIP,2.2K,1%,1/16W
    6013B0095401 RESISTOR-CHIP,300,1%,1/16W,
    (additional nodes not listed for brevity)

After obtaining each child node’s environmental footprint, we perform an environmental “hotspot” analysis. This essentially involves generating a Pareto list of the largest environmental contributors to the overall PCB footprint so that a designer or LCA practitioner can focus on those areas requiring further effort.

As Figure 3 shows, due to their upstream manufacturing, ICs are the biggest contributor to the PCB’s overall carbon footprint, followed by the use of copper in the connectors and capacitors. These automated results match those obtained from a manual LCA, and are easy to understand even for those unfamiliar with LCA or environmental impact analysis.

Figure 3. Results of environmental “hotspot” analysis. ICs are the biggest contributor to the PCB’s overall carbon footprint. [Chart legend, Components: ASIC/logic, Battery, Capacitor, Connector, Diode, Inductor, Resin, Resistor.]

PCB designers can use our auto-LCA approach to assess their design’s sustainability against that of other designs as well. We anticipate eventually creating a tool that automatically scans similar ICs preloaded into the environmental database to aid in this assessment.

Consumers as well as enterprises today demand more information about products’ environmental impact. Data mining techniques such as matrix completion, clustering, node translation, and tree reconstruction make it possible to automate LCA with reasonable accuracy and within fairly broad constraints, even in cases of low-quality inventory data. In the future, we plan to further evaluate these algorithms’ scalability and test auto-LCA on a wider variety of systems.

Manish Marwah is a senior research scientist at HP Labs, Palo Alto, California. Contact him at [Link]@[Link].

Amip Shah is a senior research scientist at HP Labs, Palo Alto, California. Contact him at [Link]@[Link].

Cullen Bash is a distinguished technologist at HP Labs, Palo Alto, California. Contact him at [Link]@[Link].

Chandrakant Patel is a senior fellow at HP Labs, Palo Alto, California. Contact him at [Link]@[Link].

Naren Ramakrishnan, the Discovery Analytics column editor, is a professor of computer science at Virginia Tech. Contact him at naren@[Link].

Selected CS articles and columns are available for free at http://[Link].
