Disambiguating Advanced Computing for Humanities Researchers
Baden Hughes
Department of Computer Science and Software Engineering
University of Melbourne
[email_address]
Agenda
  • Towards Data Intensive Research in the Humanities
  • A Thesis
  • Motivation for this Talk
  • The Disambiguation Task
  • Architectural Characteristics
  • Application Execution Models
  • Integration and Middleware
  • Interfaces
  • Extending the Computational Boundaries
  • Conclusion
Towards Data Intensive Research in the Humanities
  • Humanities data is plentiful in fields such as history, linguistics, archaeology, musicology, art and literature
  • Large collections of data can only be exploited systematically and exhaustively with the efficiency that automated analysis provides
  • To date, a barrier to the deployment of computational techniques has been the acquisition of data in digital form
  • However, data is increasingly becoming available (even freely), and there are strong indications that this trend will continue
  • Computational analysis of large volumes of data has innate challenges, even for domain experts
  • Renewed engagement with traditional questions in the humanities through this computationally-enabled, data-centric approach may allow us to answer old questions, and discover new ones
A Thesis
  • We are beginning to discover analytical needs within humanities computing disciplines which exceed available computational resources, especially with the growing popularity of data-intensive research
  • Since other domains have already reached this point and engineered solutions to the problem, humanities researchers can find synergies, adopting existing methods to solve our own research problems
  • Conversely, we may offer new techniques to other research communities, which may in turn enable them to attain their research objectives
  • This symbiosis, derived from the common locus of computational enablement of basic research, enables humanities researchers to engage with their research in a new expository fashion
Motivation for this Talk
  • Computational tractability is increasingly embedded in humanities research methodologies
  • Despite widespread adoption, humanities computing is often characterised as being “less analytically complex” and on a “smaller scale” when compared to more “scientific” computing
  • The inherent scalability of humanities computing solutions has so far gone largely unaddressed, since commodity computing has been deemed sufficient for achieving analytical goals within tolerable timeframes
  • By contrast, in scientific domains, analytical complexity has far surpassed the capacity of commodity computing, and thus new solutions have been sought, and found
  • The adoption of such solutions has allowed scientific research to identify and pursue new avenues of investigation which were previously impossible owing to purely computational constraints
The Disambiguation Task
  • Defining “Advanced Computing”: computational capability beyond that ordinarily available to researchers
  • “Advanced Computing” therefore includes services which allow resource sharing (data, services and computational cycles), and the selection and aggregation of resources (distributed by topology or geography), for solving large-scale research problems
  • Enabling humanities researchers to take advantage of these resources and services requires an understanding of the typology of the advanced computing landscape, and a lowering of the barrier to entry at both descriptive and technical levels
  • Here we seek to provide an accessible overview of the foundational components of advanced computing, motivated by the desire to inform humanities researchers of the nature of these new paradigms
Architectural Characteristics
  • Advanced computing services come in many forms, and have a somewhat interchangeable nomenclature
  • A simple typological approach is useful in understanding the characteristics of a number of common forms
      • Single CPU: what most of us have as workstations; commodity hardware
      • Shared Disk Systems: n CPUs with a common disk bank; almost commoditised, and an order of magnitude more powerful than a single CPU (e.g. SMP)
      • Shared Memory Systems: n CPUs with a common memory bank (increasingly rare)
      • Cluster: local, coordinated computational array, typically with shared storage but based on commodity hardware
      • Grid: distributed, coordinated computational array, typically commodity hardware, heavily dependent on software for integration
  • A range of complementary technologies exists within the advanced computing domain, including:
      • high-capacity, low-latency networking
      • large online data storage
      • metadata and catalogues
      • instrumentation
Application Execution Models
  • Decomposition of existing applications can reveal affinities between in situ processing models and the processing models prevalent in the advanced computing domain
  • Serial/pipeline application execution models are the default for many areas of computing (scientific as well as humanities)
  • Naturally, much greater throughput can be gained through parallel execution models, and this is the basic mode employed in advanced computing of all kinds
  • Parallelism can be derived from a number of areas:
      • data-centric parallelism
      • parametric parallelism
  • In using advanced computing services, finding opportunities to parallelise processing is an important first step
  • Not all tasks can be parallelised; some have greater natural affinity than others (e.g. segmented data, parameter space traversal)
  • There are also limits on the efficiency of parallelisation, set by the ratio of I/O to computation
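The two sources of parallelism named on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than part of the talk: the word-counting analysis, the function names, and the `min_count` and `min_length` parameters are hypothetical. A thread pool is used only to keep the example self-contained; for CPU-bound analyses, `ProcessPoolExecutor` from the same standard module offers the same `map` interface.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def count_words(segment):
    """Count word frequencies in one independent segment of a corpus."""
    return Counter(segment.lower().split())


def data_parallel_counts(segments, workers=4):
    """Data-centric parallelism: the same analysis runs on different segments."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = Counter()
        # Each segment is analysed independently, then the partial
        # results are merged into one overall count.
        for counts in pool.map(count_words, segments):
            total.update(counts)
    return total


def frequent_words(text, min_count, min_length):
    """One analysis run, controlled by two hypothetical parameters."""
    counts = count_words(text)
    return sorted(word for word, n in counts.items()
                  if n >= min_count and len(word) >= min_length)


def parameter_sweep(text, min_counts, min_lengths, workers=4):
    """Parametric parallelism: the same data is analysed once per setting."""
    settings = list(product(min_counts, min_lengths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda s: frequent_words(text, *s), settings)
        return dict(zip(settings, results))
```

In both functions the individual runs share no state, which is what gives segmented data and parameter space traversal their natural affinity for parallel execution. When each run does little computation relative to the data it must read, the cost of moving data is likely to dominate and the gain from adding workers shrinks.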
Integration and Middleware
  • Advanced computing services like grids are typically aggregations of many smaller machines
  • Middleware provides the “glue” which allows advanced computing services to be treated as a single machine for interface purposes
  • There are many middleware vendors, with great diversity in approach and functionality
  • The lower middleware layer is concerned mainly with infrastructure management (and can be largely ignored by researchers)
      • Computational service discovery, aggregation and coordination
      • Authentication and security
      • Instrumentation
  • The upper middleware layer is mainly concerned with execution management (and is one of the points of interface)
      • Batch queuing: application instances wait in a queue for processing
      • Execution brokering: dynamic determination of parameters, and/or generation of application instances, and monitoring of execution progress
  • Avoiding the middleware layer entirely …
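The two upper-layer functions described on this slide, batch queuing and execution brokering, can be sketched as a toy in-process model in Python. This is not the interface of any real middleware: `Job`, `BatchQueue` and `broker` are hypothetical names, and a real system would dispatch jobs to remote machines instead of calling a local function.

```python
from collections import deque
from dataclasses import dataclass
from itertools import product


@dataclass
class Job:
    """One application instance waiting in, or processed by, the queue."""
    job_id: int
    params: dict
    status: str = "queued"  # becomes "done" or "failed" after execution
    output: object = None


class BatchQueue:
    """Toy first-in, first-out batch queue (hypothetical, in-process)."""

    def __init__(self, application):
        self.application = application  # callable standing in for the program
        self.pending = deque()
        self.finished = []

    def submit(self, params):
        """Submission: place one application instance in the queue."""
        job = Job(job_id=len(self.pending) + len(self.finished), params=params)
        self.pending.append(job)
        return job.job_id

    def run_all(self):
        """Execution: process queued jobs in order, recording their status."""
        while self.pending:
            job = self.pending.popleft()
            try:
                job.output = self.application(**job.params)
                job.status = "done"
            except Exception as error:
                job.output = error
                job.status = "failed"
            self.finished.append(job)

    def collate(self):
        """Collation: gather the outputs of all successfully completed jobs."""
        return {job.job_id: job.output
                for job in self.finished if job.status == "done"}


def broker(batch_queue, parameter_space):
    """Execution brokering: submit one instance per parameter combination."""
    names = sorted(parameter_space)
    combinations = product(*(parameter_space[name] for name in names))
    return [batch_queue.submit(dict(zip(names, values)))
            for values in combinations]
```

A researcher using such an interface would supply only the application and the parameter space, for example `broker(BatchQueue(analyse), {"n": [1, 2, 3]})` with a hypothetical `analyse` function, and leave queuing and progress tracking to the middleware.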
Interfaces
  • Interfaces to advanced computational services are polymorphic
      • Some require fundamental changes to technical approach on the part of the researcher
      • Others are easily integrated and offer simple but functional access
  • Simple batch queue interfaces allow submission, execution and collation of experiment output
      • Globus and derivatives
      • NorduGrid’s ARC
  • Many popular programming languages have native-like support for execution in advanced computational environments
      • C/C++: libraries
      • Java: classes, threads
      • Python: wrappers, modules, threads
      • Perl: wrappers, modules, threads
      • Web Services: services
  • Some specialised frameworks have native support for parallel, clustered and distributed execution
Extending the Computational Boundaries
  • Adopting computational approaches can impact our research methodologies
      • Raw data collections of increasing size can be analysed efficiently by computational means
      • Increasing complexity of analysis can be facilitated computationally
  • On the horizon is computational capability beyond the bounds imposed by any individual researcher’s computational environment
  • Not only do advanced computing services offer new capabilities, they also spawn opportunities for new types of research collaboration
  • Motivated by basic scientific enquiry to
      • test the adequacy of answers to questions we thought were answered
      • find answers to older unanswered questions
      • discover new questions
Conclusion
  • Advanced computing offers new opportunities in humanities research, both in terms of methodology and technology
  • Advanced computing takes many different forms, each of which could be more or less applicable to individual research programs
  • Humanities computing can offer insights into enablement questions within the scientific and computational community, and our contribution is welcome
  • Challenges remain in making advanced computing services more accessible to humanities researchers, but utility-style computing is a key goal of the advanced computing research community
  • Renewed engagement with traditional questions in the humanities through this computationally-enabled, data-centric approach may allow us to answer old questions, and hopefully discover new ones waiting to be answered
