Schneider J, Fuhr D, Korte E and Thim C (2017), "Security-Self-Assessment in kritischen Infrastrukturen", In D·A·CH Security. München, 9, 2017. |
Abstract: The threat landscape for critical infrastructures with regard to cyber incidents has intensified further in recent years.
Regulation tries to address this with requirements for large operators, such as the IT Security Act (IT-Sicherheitsgesetz). Small and medium-sized operators of critical infrastructures therefore face a double challenge: lacking legal requirements, they first have to find their own solution, and second they have to manage this with their limited budget of money and staff. In the BMBF-funded research project Aqua-IT-Lab, a methodology was developed that allows small and medium-sized operators of water supply and wastewater disposal facilities to assess the IT security of their automation technology themselves, with limited effort and without deep security expertise, so that they can plan the most important implementation steps in a risk-based and resource-efficient way. The methodology can also be transferred to other sectors such as energy. |
BibTeX:
@inproceedings{DACH2017, author = {Jörg Schneider and David Fuhr and Edgar Korte and Christof Thim}, title = {Security-Self-Assessment in kritischen Infrastrukturen}, booktitle = {D·A·CH Security}, year = {2017} } |
Linnert B, Schneider J and Burchard L-O (2014), "Mapping Algorithms Optimizing the Overall Manhattan Distance for pre-occupied Cluster Computers in SLA-based Grid environments", In 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014)., 5, 2014. IEEE CS Press. |
Abstract: Grid applications are more and more widely used nowadays. One of the
major challenges is to provide a reliable and predictable platform for computations of various kinds. In order to overcome this challenge, Grid management systems such as the virtual resource manager (VRM) implement scheduling and mapping algorithms at the level of the local management systems with support for resource reservation in advance. In this paper, we examine three different mapping algorithms for supercomputers and cluster systems with respect to execution time and the achievable performance regarding important metrics such as overall Manhattan distance and achievable utilization. The results show the importance of carefully implementing scheduling and mapping algorithms in Grid environments. |
BibTeX:
@inproceedings{Linnert2014a, author = {Barry Linnert and Jörg Schneider and Lars-Olof Burchard}, title = {Mapping Algorithms Optimizing the Overall Manhattan Distance for pre-occupied Cluster Computers in SLA-based Grid environments}, booktitle = {14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014)}, publisher = {IEEE CS Press}, year = {2014}, note = {angenommen} } |
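The abstract above names the overall Manhattan distance as a metric but does not define it. The following is a minimal sketch under two assumptions that are not taken from the paper: compute nodes sit on a 2D mesh, and the overall distance of a job is the sum of pairwise Manhattan distances between its allocated nodes. The function names are hypothetical.

```python
from itertools import combinations


def manhattan(a, b):
    """Manhattan distance between two (row, column) mesh coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def overall_manhattan_distance(nodes):
    """Sum of pairwise Manhattan distances over all allocated nodes.

    Assumption: this pairwise sum is only one plausible reading of
    'overall Manhattan distance'; the paper may define it differently.
    """
    return sum(manhattan(a, b) for a, b in combinations(nodes, 2))


# A compact 2x2 allocation versus the four corners of a 4x4 mesh.
compact = [(0, 0), (0, 1), (1, 0), (1, 1)]
scattered = [(0, 0), (0, 3), (3, 0), (3, 3)]
```

Under this reading, a mapping algorithm would prefer the compact allocation, since its overall distance is lower than that of the scattered one.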
Pfeffer T, Herber P and Schneider J (2014), "Reverse Engineering of ARM Binaries Using Formal Transformations", In The 7th International Conference on Security of Information and Networks. Glasgow, UK, 9, 2014. |
Abstract: Understanding the behavior of a program when no source code is available tends to be a complicated and time-expensive task. In this paper, we present a novel approach for reverse engineering of ARM binaries. The main idea is to translate the original assembler representation into a formal intermediate representation language, namely WSL, and then to apply rephrasing transformations to the code. To achieve a highly modular translation, we define a rule set to translate each assembler instruction individually. Furthermore, new rephrasing rules were developed to recover high level control flow aspects and to eliminate assembler specific program fragments in the intermediate code. We demonstrate the applicability of our approach through the successful recovery of high level control flow statements in the Debian coreutils binaries. Using these example binaries, we studied the performance and the quality of our transformation. |
BibTeX:
@inproceedings{PHS2014, author = {Tobias Pfeffer and Paula Herber and Jörg Schneider}, title = {Reverse Engineering of ARM Binaries Using Formal Transformations}, booktitle = {The 7th International Conference on Security of Information and Networks}, year = {2014} } |
Schneider J and Linnert B (2014), "List-based Data Structures for Efficient Management of Advance Reservations", International Journal of Parallel Programming., accepted., 2, 2014. Vol. 42(1), pp. 77-93. Springer. |
Abstract: Complex eScience and other sophisticated applications in the field
of HPC imply new demands that queuing based resource management systems cannot meet. To guarantee Quality of Service and co-allocation in the Grid, planning based resource management systems implementing advance reservation are needed. These systems face new challenges as a planning based management system has to keep track of the jobs and reservations in the future. Additionally, during the negotiation process of incoming reservations, a good overview of the remaining, not-yet reserved capacity is needed---not only for the current allocation, but also for the whole book-ahead time. Therefore, the resource management problem becomes a two dimensional problem for advance reservations in this field. In this paper different data structures are investigated and discussed in order to fit to planning based resource management. As a result the benefits of using lists of resource allocation or free blocks are exposed. This general idea widely used to manage continuous resources is extended to cover not only the resource dimension but also the time dimension. The list of blocks approach is evaluated in a Grid level and a local resource management system for a computing cluster. The extensive simulations showed a better runtime and higher reservation success rate compared with the currently favored approach of a slotted time and the more sophisticated approach based on AVL trees. |
BibTeX:
@article{Schneider2012, author = {Jörg Schneider and Barry Linnert}, editor = {Utpal Banerjee and Nicholas Carriero and Alexandru Nicolau}, title = {List-based Data Structures for Efficient Management of Advance Reservations}, journal = {International Journal of Parallel Programming}, publisher = {Springer}, year = {2014}, volume = {42}, number = {1}, pages = {77-93}, url = {http://www.user.tu-berlin.de/komm/paper/2012-Schneider-Linnert-data-structures-for-adv.-reservation.pdf} } |
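The list-of-free-blocks idea from the abstract above can be illustrated with a much simplified sketch. It is not the data structure evaluated in the paper: it assumes a single resource with an aggregate capacity and keeps only a time-ordered list of blocks, each holding the free capacity for one interval. All names (`Block`, `FreeBlockList`, `reserve`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    start: int  # inclusive start time
    end: int    # exclusive end time
    free: int   # free capacity (e.g. nodes) throughout [start, end)


class FreeBlockList:
    """Time-ordered list of blocks covering the whole book-ahead time."""

    def __init__(self, horizon: int, capacity: int):
        self.blocks = [Block(0, horizon, capacity)]

    def _split(self, t: int) -> None:
        # Cut the block containing t so that t becomes a block boundary.
        for i, b in enumerate(self.blocks):
            if b.start < t < b.end:
                self.blocks[i:i + 1] = [Block(b.start, t, b.free),
                                        Block(t, b.end, b.free)]
                return

    def can_reserve(self, start: int, end: int, amount: int) -> bool:
        if start < 0 or start >= end or end > self.blocks[-1].end:
            return False
        # Every block overlapping [start, end) must have enough capacity.
        return all(b.free >= amount
                   for b in self.blocks if b.start < end and b.end > start)

    def reserve(self, start: int, end: int, amount: int) -> bool:
        if not self.can_reserve(start, end, amount):
            return False
        self._split(start)
        self._split(end)
        for b in self.blocks:
            if start <= b.start and b.end <= end:
                b.free -= amount
        return True


fbl = FreeBlockList(horizon=100, capacity=8)
first = fbl.reserve(10, 20, 5)     # accepted
second = fbl.reserve(15, 25, 4)    # rejected: only 3 free in [10, 20)
third = fbl.reserve(15, 25, 3)     # accepted
```

The sketch uses a linear scan for clarity and never merges neighbouring blocks with equal free capacity; a production structure would likely do both differently.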
Lell J, Koch S and Schneider J (2013), "StackIDS - Catching Binary Exploits before they Execute a System Call", In Herbsttreffen der GI-Fachgruppe Betriebssysteme. |
BibTeX:
@inproceedings{Lell2013, author = {Jakob Lell and Sebastian Koch and Jörg Schneider}, title = {StackIDS - Catching Binary Exploits before they Execute a System Call}, booktitle = {Herbsttreffen der GI-Fachgruppe Betriebssysteme}, year = {2013}, url = {http://www.betriebssysteme.org/Aktivitaeten/Treffen/2013-Berlin/Programm/docs/lell_koch_schneider-stackids.pdf} } |
Schepke C, Maillard N, Schneider J and Heiß H-U (2013), "Online Mesh Refinement for Parallel Atmospheric Models", International Journal of Parallel Programming., 8, 2013. Vol. 41(4), pp. 552-569. Springer. |
Abstract: Forecast precisions of climatological models are limited by computing
power and time available for the executions. As more and faster processors are used in the computation, the resolution of the mesh adopted to represent the Earth's atmosphere can be increased, and consequently the numerical forecast is more accurate. However, a finer mesh resolution, able to include local phenomena in a global atmosphere integration, is still not possible due to the large number of data elements to compute in this case. To overcome this situation, different mesh refinement levels can be used at the same time for different areas of the domain. Thus, our paper evaluates how mesh refinement at run time (online) can improve performance for climatological models. The online mesh refinement (OMR) increases dynamically mesh resolution in parts of a domain, when special atmosphere conditions are registered during the execution. Experimental results show that the execution of a model improved by OMR provides better resolution for the meshes, without any significant increase of execution time. The parallel performance of the simulations is also increased through the creation of threads in order to explore different levels of parallelism. |
BibTeX:
@article{Schepke2013, author = {Claudio Schepke and Nicolas Maillard and Jörg Schneider and Hans-Ulrich Heiß}, title = {Online Mesh Refinement for Parallel Atmospheric Models}, journal = {International Journal of Parallel Programming}, publisher = {Springer}, year = {2013}, volume = {41}, number = {4}, pages = {552-569}, url = {http://link.springer.com/article/10.1007/s10766-012-0235-4} } |
Koch S, Schneider J and Nordholz J (2012), "Disturbed playing: Another kind of educational security games", In 5th Workshop on Cyber Security Experimentation and Test at Usenix Security 2012. Seattle, US, 8, 2012. USENIX Association. |
Abstract: Games have a long tradition in teaching IT security: Ranging from
international capture-the-flag competitions played by multiple teams to educational simulation games where individual students can get a feeling for the effects of security decisions. All these games have in common that the game's main goal is maintaining security. In this paper, we propose another kind of educational security games which feature a game goal unrelated to IT security. However, during the game session gradually more and more attacks on the underlying infrastructure disturb the game play. Such a scenario is very close to the reality of an IT security expert, where establishing security is just a necessary requirement to reach the company's goals. By preparing and analyzing the game sessions, the students learn how to develop a security policy for a simplified scenario. Additionally, the students learn to decide when to apply technical security measures, when to establish emergency plans, and which risks cannot be covered economically. As an example for such a disturbed playing game, we present our distributed air traffic control scenario. The game play is disturbed by attacking the integrity and availability of the underlying network in a coordinated manner, i.e., all student teams experience the same failures at the same state of the game. Besides presenting the technical aspects of the setup, we also discuss the didactic approach and the experiences made in the last years. |
BibTeX:
@inproceedings{Koch2012, author = {Sebastian Koch and Jörg Schneider and Jan Nordholz}, title = {Disturbed playing: Another kind of educational security games}, booktitle = {5th Workshop on Cyber Security Experimentation and Test at Usenix Security 2012}, publisher = {USENIX Association}, year = {2012}, url = {http://www.user.tu-berlin.de/komm/paper/2012-Schneider-Koch-Nordholz-Disturbed-Playing.pdf} } |
Schneider J (2012), "How do you know that your cloud operator does not cheat?", In Workshop on Service Science and Engineering., accepted. Shanghai, CN Springer. |
Abstract: The security of a system is usually based on the physical security
of the hardware. In a Cloud setup, this basic assumption cannot be assured as the system runs as a virtual machine (VM) on the operator's hardware. The operator has access to all files, has access to the main memory, can interfere with the communication, and can manipulate the control flow. The Cloud operator can even hide manipulations by creating a virtual view for the user. In the talk, I will show how the security goals confidentiality, integrity, and availability can be violated by the Cloud provider. The user may not be able to prevent such manipulations, but can sign a service level agreement (SLA) and negotiate fines to be paid. For the Cloud operator, the manipulations are no longer lucrative if the risk to be discovered and the fine is high enough. However, a mechanism is needed to detect an attack reliably to enforce the SLA. I will present such detection mechanisms for various attack types and analyze how a bogus Cloud operator may still avoid the detection. |
BibTeX:
@inproceedings{Schneider2012a, author = {Jörg Schneider}, title = {How do you know that your cloud operator does not cheat?}, booktitle = {Workshop on Service Science and Engineering}, publisher = {Springer}, year = {2012} } |
Schepke C, Maillard N, Schneider J and Heiß H-U (2011), "Why Online Dynamic Mesh Refinement is Better for Parallel Climatological Models?", In 23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Vitoria, BR, 10, 2011. , pp. 168 - 175. IEEE Computer Society. |
Abstract: Forecast precisions of climatological models are limited by computing
power and time available for the executions. As more and faster processors are used in the computation, the resolution of the mesh adopted to represent the Earth's atmosphere can be increased, and consequently the numerical forecast is more accurate and shows local phenomena. However, a finer mesh resolution, able to include local phenomena in a global atmosphere integration, is still not possible. To overcome this situation, different mesh refinement levels can be used at the same time for different areas. In this context, this paper evaluates how mesh refinement at run time can improve performance for climatological models. In order to contribute with this analysis, an online dynamic mesh refinement was developed. It increases mesh resolution in parts of a parallel distributed model, when special atmosphere conditions are registered during the execution. The results show that the parallel execution of this improvement provides better resolution for the meshes, without a significant increase of execution time. |
BibTeX:
@inproceedings{schepke2011a, author = {Claudio Schepke and Nicolas Maillard and Jörg Schneider and Hans-Ulrich Heiß}, title = {Why Online Dynamic Mesh Refinement is Better for Parallel Climatological Models?}, booktitle = {23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)}, publisher = {IEEE Computer Society}, year = {2011}, pages = {168 - 175}, url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6106019}, doi = {10.1109/SBAC-PAD.2011.14} } |
Schepke C, Maillard N, Schneider J and Heiß H-U (2011), "Online Mesh Refinement in Parallel Meteorological Applications", In Proceedings of Latin American Conference on High Performance Computing (CLCAR). |
BibTeX:
@inproceedings{schepke2011, author = {Claudio Schepke and Nicolas Maillard and Jörg Schneider and Hans-Ulrich Heiß}, title = {Online Mesh Refinement in Parallel Meteorological Applications}, booktitle = {Proceedings of Latin American Conference on High Performance Computing (CLCAR)}, year = {2011} } |
Schneider J and Linnert B (2011), "Efficiently Managing Advance Reservations Using Lists of Free Blocks", In 23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). , pp. 183-190. |
Abstract: Advance reservation was identified as a key technology to enable guaranteed
Quality of Service and co-allocation in the Grid. Nonetheless, most Grid and local resource management systems still use the queuing approach because of the additional complexity introduced by advance reservation. A planning based resource management system has to keep track of the reservations in the future and needs a good overview on the available capacity during the negotiation of incoming reservations. For advance reservation, the resource management problem becomes a two dimensional problem. In this paper different data structures are investigated and discussed in order to fit to planning based resource management. As a result the benefits of using lists of resource allocation or free blocks are exposed. This general idea widely used to manage continuous resources is extended to cover not only the resource dimension but also the time dimension. The list of blocks approach is evaluated in a Grid level and a resource level resource management system. The extensive simulations showed a better runtime and higher reservation success rate compared with the currently favored approach of a slotted time. |
BibTeX:
@inproceedings{schneider2011, author = {Jörg Schneider and Barry Linnert}, title = {Efficiently Managing Advance Reservations Using Lists of Free Blocks}, booktitle = {23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)}, year = {2011}, pages = {183-190}, url = {http://www.user.tu-berlin.de/komm/paper/2011-Schneider-Linnert-Managing-Adv.-Reservations.pdf}, doi = {10.1109/SBAC-PAD.2011.25} } |
(2011), "INFORMATIK 2011 - Informatik schafft Communities" Bonn, 10, 2011. (192) Köllen Verlag. |
Abstract: INFORMATIK 2011 is the 41st annual conference of the Gesellschaft
für Informatik e.V. (GI). The topic of this year's conference is "Computer Science creates Communities" and it's not only about virtual communities, but also real ones like the scientific community: How can we improve the networking within the computer science community and the connections to politics, industry, and society? How do we use new social media within the scientific communities? But the conference is also about research regarding the new technologies helping communities: From online social networks to software support for huge events or traffic guidance systems. |
BibTeX:
@proceedings{Informatik2011, editor = {Hans-Ulrich Heiß and Peter Pepper and Holger Schlingloff and Jörg Schneider}, title = {INFORMATIK 2011 - Informatik schafft Communities}, publisher = {Köllen Verlag}, year = {2011}, number = {192}, url = {http://www.user.tu-berlin.de/komm/CD/html/index.html} } |
Diener M, Madruga FL, Rodrigues ER, Alves MAZ, Schneider J, Navaux POA and Heiß H-U (2010), "Evaluating Thread Placement Based on Memory Access Patterns for Multi-core Processors", In Proceedings of 12th IEEE International Conference on High Performance Computing and Communications (HPCC-2010). , pp. 491-496. |
Abstract: Process placement is a technique widely used on parallel machines
with heterogeneous interconnections to reduce the overall communication time. For instance, two processes which communicate frequently are mapped close to each other. Finding the optimal mapping between threads and cores in a shared-memory environment (for example, OpenMP and Pthreads) is an even more complex task due to implicit communication. In this work, we examine data sharing patterns between threads in different workloads and use those patterns in a similar way as messages are used to map processes in cluster computers. We evaluated our technique on two state-of-the-art multi-core processors and achieved moderate improvements in the common case and considerable improvements in some cases, reducing execution time by up to 45%. |
BibTeX:
@inproceedings{Diener2010, author = {Matthias Diener and Felipe L. Madruga and Eduardo R. Rodrigues and Marco A. Z. Alves and Jörg Schneider and Philippe O. A. Navaux and Hans-Ulrich Heiß}, editor = {Guerrero, Juan E}, title = {Evaluating Thread Placement Based on Memory Access Patterns for Multi-core Processors}, booktitle = {Proceedings of 12th IEEE International Conference on High Performance Computing and Communications (HPCC-2010)}, year = {2010}, pages = {491--496} } |
Dragiev S and Schneider J (2010), "Grid Workflow Recovery as Dynamic Constraint Satisfaction Problem", In Proceedings of 2010 IEEE Conference on Open Systems (ICOS2010). Kuala Lumpur, 10, 2010. , pp. 74-79. |
Abstract: With service level agreements (SLAs) the Grid broker guarantees to
finish the Grid jobs by a given deadline. There are a number of approaches to plan reservations to fulfil these deadline requirements and to handle currently running jobs in the case of a resource failure. However, there is a lack of strategies to handle the already planned but not yet started jobs. These jobs will be most likely also affected by the resource failure and can be remapped to other resources well in advance. Complex Grid jobs (Grid workflows) consisting of multiple sub-jobs introduce a higher complexity to determine a remapping saving as many Grid jobs as possible. In this paper a recovery scheme for Grid workflows using a dynamic constraint solver is presented and the gain in the number of saved Grid jobs is evaluated using extensive simulations. |
BibTeX:
@inproceedings{Dragiev2010, author = {Stanimir Dragiev and Jörg Schneider}, title = {Grid Workflow Recovery as Dynamic Constraint Satisfaction Problem}, booktitle = {Proceedings of 2010 IEEE Conference on Open Systems (ICOS2010)}, year = {2010}, pages = {74-79}, doi = {10.1109/ICOS.2010.5720067} } |
Gasmi Y and Schneider J (2010), "E-Mail Security as Cooperation Problem", In Proceedings of Workshop on Systems Communication and Engineering in Computer Science. |
BibTeX:
@inproceedings{gasmi2010, author = {Yacine Gasmi and Jörg Schneider}, title = {E-Mail Security as Cooperation Problem}, booktitle = {Proceedings of Workshop on Systems Communication and Engineering in Computer Science}, year = {2010} } |
Köppe F and Schneider J (2010), "Do you get what you pay for? Using Proof-of-Work Functions to Verify Performance Assertions in the Cloud", In Proceedings of International Workshop on Cloud Privacy, Security, Risk & Trust (CPSRT 2010). Indianapolis , pp. 687. IEEE CS Press. |
Abstract: In the Cloud, the operators usually offer resources on a pay per use
price model. The client gets access to a newly created virtual machine and has no direct access to the underlying hardware. Therefore, the client cannot verify whether the Cloud operator provides the negotiated amount of resources or only a fraction thereof. Especially, the assigned share of CPU time can be easily forged by the operator. The client could use a normal benchmark to verify the performance of his virtual machine. However, as the Cloud operator owns the underlying infrastructure, the operator could also tamper with the benchmark execution. We identified four attack vectors to modify the results of the benchmark. Based on these attack vectors, we showed that using proof-of-work functions can disable three of them. Proof-of-work functions are challenge response systems, where it is simple to generate a challenge and verify the result while solving the challenge is compute intensive. We implemented three proof-of-work functions in a prototype benchmark. Experiments showed that the runtime of the proof-of-work functions sufficiently relates to the results of the reference benchmark suite SPEC CPU2006. |
BibTeX:
@inproceedings{Koeppe2010, author = {Falk Köppe and Jörg Schneider}, title = {Do you get what you pay for? Using Proof-of-Work Functions to Verify Performance Assertions in the Cloud}, booktitle = {Proceedings of International Workshop on Cloud Privacy, Security, Risk \& Trust (CPSRT 2010)}, publisher = {IEEE CS Press}, year = {2010}, pages = {687}, url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5708518}, doi = {10.1109/CloudCom.2010.100} } |
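The challenge-response property described in the abstract above (cheap to generate and verify, expensive to solve) can be illustrated with a hash-based puzzle. This is a generic hashcash-style sketch, not one of the three proof-of-work functions implemented in the paper, which the abstract does not name. The function names and the SHA-256 construction are assumptions made for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Cheap check: a single hash evaluation."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Expensive search: about 2**difficulty hash evaluations on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


# The client would time solve() inside the virtual machine and check
# the returned nonce outside of it. The challenge value is made up.
challenge = b"hypothetical-challenge"
nonce = solve(challenge, difficulty=12)
```

Because verification costs one hash while solving costs thousands, the time needed to answer a fresh challenge gives a rough indication of the CPU share actually received.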
Schneider J and Koch S (2010), "HTTPreject: Handling Overload Situations without Losing the Contact to the User", In Proceedings of European Conference on Computer Network Defense (EC2ND 2010). , pp. 29-34. |
Abstract: The web is a crucial source of information nowadays. At the same time,
web applications become more and more complex. Therefore, a spontaneous increase in the number of visitors, e.g., based on news reports or events, easily brings a web server into an overload situation. In contrast to the classical model of distributed denial of service (DDoS) attacks, such a so-called flash effect situation is not triggered by a bulk of bots just aiming at hurting the system but by humans with a high interest in the content of the web site itself. While the bots do not stop their attack until told so by their operator, the users try repeatedly to access the site without knowing that the repeated reloads effectively increase the web server's overload. Classical approaches try to distinguish between real user requests and harmful requests, which is not applicable in this scenario. Simply restricting the number of connections leads to very technical error messages displayed by the users' client software, if at all. Therefore, we propose a means to efficiently block connection attempts and to keep the user informed at the same time. A small subset of HTTP and TCP is statelessly implemented to display simple busy messages or relevant news updates to the end user with only few resources. In this paper we present the protocol subset used and discuss the compatibility problems on the protocol and client software level. Furthermore, we show the results of performance experiments using a prototype implementation. |
BibTeX:
@inproceedings{Schneider2010, author = {Jörg Schneider and Sebastian Koch}, title = {HTTPreject: Handling Overload Situations without Losing the Contact to the User}, booktitle = {Proceedings of European Conference on Computer Network Defense (EC2ND 2010)}, year = {2010}, pages = {29-34}, url = {http://www.user.tu-berlin.de/komm/paper/2010-schneider-koch-HTTPreject.pdf}, doi = {10.1109/EC2ND.2010.7} } |
Dragiev S and Schneider J (2009), "Grid Workflow Recovery as Dynamic Constraint Satisfaction Problem", In Proceedings of 23. PARS Workshop. Parsberg |
Abstract: With service level agreements (SLAs) the Grid broker guarantees to
finish the Grid jobs by a given deadline. There are a number of approaches to plan reservations to fulfil these deadline requirements and to handle currently running jobs in the case of a resource failure. However, there is a lack of strategies to handle the already planned but not yet started jobs. These jobs will be most likely also affected by the resource failure and can be remapped to other resources well in advance. Complex Grid jobs (Grid workflows) consisting of multiple sub-jobs introduce a higher complexity to determine a remapping saving as many Grid jobs as possible. In this paper a recovery scheme for Grid workflows using a dynamic constraint solver is presented and the gain in the number of saved Grid jobs is evaluated using extensive simulations. |
BibTeX:
@inproceedings{Dragiev2009, author = {Stanimir Dragiev and Jörg Schneider}, title = {Grid Workflow Recovery as Dynamic Constraint Satisfaction Problem}, booktitle = {Proceedings of 23. PARS Workshop}, year = {2009} } |
Gehr J and Schneider J (2009), "Measuring Fragmentation of Two-Dimensional Resources Applied to Advance Reservation Grid Scheduling", In Proceedings of 9th International Symposium on Cluster Computing and the Grid (CCGrid 09). Shanghai, 5, 2009. , pp. 276-283. |
Abstract: Whenever a resource allocation fails although enough free capacity
is available, fragmentation is easily spotted as the cause. But how the fragmentation in a system requiring continuous allocations like time schedules or memory can be quantified is hardly analyzed. A Grid environment using advance reservation even combines two dimensions: time and resource dimension. In this paper a new way to measure the fragmentation of a system in one dimension is proposed. This measure is then extended to incorporate also the second dimension. Extensive simulations showed that the proposed fragmentation measure is a good indicator of the state of the system. |
BibTeX:
@inproceedings{Gehr2009, author = {Julius Gehr and Jörg Schneider}, title = {Measuring Fragmentation of Two-Dimensional Resources Applied to Advance Reservation Grid Scheduling}, booktitle = {Proceedings of 9th International Symposium on Cluster Computing and the Grid (CCGrid 09)}, year = {2009}, pages = {276-283}, url = {http://www.user.tu-berlin.de/komm/paper/2009-measure-2D-fragmentation.pdf}, doi = {10.1109/CCGRID.2009.81} } |
Schneider J, Gehr J, Heiß H-U, Ferreto T, Rose CD, Righi R, Rodrigues ER, Maillard N and Navaux P (2009), "Design of a Grid workflow for a climate application", In Proceedings of IEEE Symposium on Computers and Communications (ISCC'09). , pp. 793. |
Abstract: Grid applications can be modeled as a composition of rather independent
tasks. There are two approaches to define such a workflow: either by combining multiple applications to build a more complex functionality or by splitting up an existing application. In this paper we analyze the latter process. We present a compute intensive application for climatology simulation and the options available to split it up. Using the simulation mode of our Grid broker, we were able to compare the different workflow specifications before actually executing the workflows. This case study showed that using finer-grained workflows, which usually need more adjustments to the software, allows better performance in the Grid. |
BibTeX:
@inproceedings{Schneider2009, author = {Jörg Schneider and Julius Gehr and Hans-Ulrich Heiß and Tiago Ferreto and César De Rose and Rodrigo Righi and Eduardo R. Rodrigues and Nicolas Maillard and Philippe Navaux}, title = {Design of a Grid workflow for a climate application}, booktitle = {Proceedings of IEEE Symposium on Computers and Communications (ISCC'09)}, year = {2009}, pages = {793}, doi = {10.1109/ISCC.2009.5202233} } |
Burchard L-O, Heiß H-U, Linnert B, Schneider J and Rose CAD (2008), "VRM: A failure-aware Grid resource management system", International Journal of High Performance Computing and Networking. Vol. 5(4), pp. 215-226. |
Abstract: For resource management in Grid environments, advance reservations
turned out to be very useful and hence are supported by a variety of Grid toolkits. However, failure recovery for such systems has not yet received the attention it deserves. In this paper, we address the problem of remapping reservations to other resources, when the originally selected resource fails. Instead of dealing with jobs already running, which usually means checkpointing and migration, our focus is on jobs that are scheduled on the failed resource for a specific future period of time but not started yet. The most critical factor when solving this problem is the estimation of the downtime. We avoid the drawbacks of under- or over-estimating the downtime by a dynamic load-based approach that is evaluated by extensive simulations in a Grid environment and shows superior performance compared to estimation-based approaches. |
BibTeX:
@article{Burchard2008, author = {Lars-Olof Burchard and Hans-Ulrich Heiß and Barry Linnert and Jörg Schneider and Cesar A.F. De Rose}, title = {VRM: A failure-aware Grid resource management system}, journal = {International Journal of High Performance Computing and Networking}, year = {2008}, volume = {5}, number = {4}, pages = {215-226}, doi = {10.1504/IJHPCN.2008.022298} } |
Schneider J, Gehr J, Linnert B and Röblitz T (2008), "An Efficient Protocol for Reserving Multiple Grid Resources in Advance", In Grid and Services Evolution (Proceedings of the 3rd CoreGRID Workshop on Grid Middleware). , pp. 189-204. Springer. |
Abstract: We propose a mechanism for the co-allocation of multiple resources
in Grid environments. By reserving multiple resources in advance, scientific simulations and large-scale data analyses can efficiently be executed with their desired quality-of-service level. Co-allocating multiple Grid resources in advance poses demanding challenges due to the characteristics of Grid environments, which are (1) incomplete status information, (2) dynamic behavior of resources and users, and (3) autonomous resources' management systems. Our co-reservation mechanism addresses these challenges by probing the state of the resources and by enhancing a two-phase commit protocol with timeouts. We performed extensive simulations to evaluate communication overhead of the new protocol and the impact of the timeouts' length on the scheduling of jobs as well as on the utilization of the Grid resources. Keywords: Grid resource management, advance |
BibTeX:
@inproceedings{Schneider2008, author = {Jörg Schneider and Julius Gehr and Barry Linnert and Thomas Röblitz}, title = {An Efficient Protocol for Reserving Multiple Grid Resources in Advance}, booktitle = {Grid and Services Evolution (Proceedings of the 3rd CoreGRID Workshop on Grid Middleware)}, publisher = {Springer}, year = {2008}, pages = {189-204}, url = {http://www.user.tu-berlin.de/komm/paper/2008-efficient-protocol-advance-reservation.pdf}, doi = {10.1007/978-0-387-85966-8_14} } |
Bergmann A, Schneider J and Heiß H-U (2007), "Behandlung offener Netzwerkverbindungen bei Prozessmigration", In Proceedings of 21. PARS Workshop. Hamburg |
BibTeX:
@inproceedings{Bergmann2007, author = {Andreas Bergmann and Jörg Schneider and Hans-Ulrich Heiß}, title = {Behandlung offener Netzwerkverbindungen bei Prozessmigration}, booktitle = {Proceedings of 21. PARS Workshop}, year = {2007} } |
Decker J and Schneider J (2007), "Heuristic Scheduling of Grid Workflows Supporting Co-Allocation and Advance Reservation", In 7th Intl. IEEE Intl. Symposium on Cluster Computing and the Grid (CCGrid07). Rio de Janeiro, Brazil, 5, 2007. , pp. 335-342. IEEE CS Press. |
Abstract: Applications to be executed in Grid computing environments become
more and more complex and usually consist of multiple interdependent tasks. The coordinated execution of such tightly or loosely coupled tasks often requires simultaneous access to different Grid resources. This leads to the problem of resource co-allocation. Efficient and robust scheduling algorithms have to be developed that can cope with the Grid's large-scale distribution, a high number of competing and demanding applications, the inherent resource heterogeneity and the often limited view on resource availability. In this paper, we present two heuristic scheduling algorithms that are based on a well-known list scheduling algorithm and both support co-allocation and advance resource reservation. Our first algorithm preserves the run-time efficiency of Greedy list schedulers while the second approach incorporates more sophisticated search techniques in order to achieve better results with respect to the performance metrics. Both algorithms have been implemented within a Grid simulation framework. An extensive simulation study was conducted to evaluate and compare the performance of both algorithms. It showed the general suitability of our enhanced list scheduling heuristics within heterogeneous Grid environments. |
BibTeX:
@inproceedings{Decker2007, author = {Jörg Decker and Jörg Schneider}, editor = {Bruno Schulze and Rajkumar Buyya and Philippe Navaux and Walfredo Cirne and Vinod Rebello}, title = {Heuristic Scheduling of Grid Workflows Supporting Co-Allocation and Advance Reservation}, booktitle = {7th Intl. IEEE Intl. Symposium on Cluster Computing and the Grid (CCGrid07)}, publisher = {IEEE CS Press}, year = {2007}, pages = {335--342}, url = {http://www.kbs.cs.tu-berlin.de/publications/fulltext/decker-heuristicWorkflow.pdf} } |
Burchard L-O, Heiß H-U, Linnert B, Schneider J, Kao O, Hovestadt M, Heine F and Keller A (2006), "The Virtual Resource Manager: Local Autonomy versus QoS Guarantees for Grid Applications", In Future Generation Grids. Vol. 2 |
Abstract: In this paper, we describe the architecture of the virtual resource
manager VRM, a management system designed to reside on top of local resource management systems for cluster computers and other kinds of resources. The most important feature of the VRM is its capability to handle quality-of-service (QoS) guarantees and service-level agreements (SLAs). The particular emphasis of the paper is on the various opportunities to deal with local autonomy for resource management systems not supporting SLAs. As local administrators may not want to hand over complete control to the Grid management, it is necessary to define strategies that deal with this issue. Local autonomy should be retained as much as possible while providing reliability and QoS guarantees for Grid applications, e.g., specified as SLAs. |
BibTeX:
@inproceedings{Burchard2006, author = {Lars-Olof Burchard and Hans-Ulrich Heiß and Barry Linnert and Jörg Schneider and Odej Kao and Matthias Hovestadt and Felix Heine and Axel Keller}, editor = {Getov, Vladimir and Laforenza, Domenico and Reinefeld, Alexander}, title = {The Virtual Resource Manager: Local Autonomy versus QoS Guarantees for Grid Applications}, booktitle = {Future Generation Grids}, year = {2006}, volume = {2}, url = {http://www.user.tu-berlin.de/komm/paper/FGG-local-autonomy-vs-SLA.pdf}, doi = {http://www.springerlink.com/content/m50g77l430705x03/} } |
Schneider J, Linnert B and Burchard L-O (2006), "Distributed Workflow Management for Large-Scale Grid Environments", In IEEE/IPSJ International Symposium on Applications and the Internet (SAINT 2006). Phoenix, Arizona, USA, 1, 2006. , pp. 229-235. IEEE Computer Society Press. |
Abstract: Workflow management in large-scale Grid environments is a very challenging
task that centralized management systems are not able to cover sufficiently. Therefore, we present our Workflow On-line Resource Management (WORM) architecture built on top of active network technology. The approach integrates a peer-to-peer like organized workflow management system with existing or newly built management systems for the resources building the Grid. In our approach, each workflow is represented by a mobile autonomous entity which uses the active network infrastructure to move through the Grid, which is represented by an active overlay network on top of existing network infrastructure. Thus, control of the workflow execution is handed over to the autonomous code without requiring a central system to be in charge of the computation and cope with reservation, failures, etc. The WORM architecture is presented together with a classification into the taxonomy of workflow management systems. |
BibTeX:
@inproceedings{BurchardEtAl-2006-Large-Scale-Workflow, author = {Jörg Schneider and Barry Linnert and Lars-Olof Burchard}, title = {Distributed Workflow Management for Large-Scale Grid Environments}, booktitle = {IEEE/IPSJ International Symposium on Applications and the Internet (SAINT 2006)}, publisher = {IEEE Computer Society Press}, year = {2006}, pages = {229--235}, url = {http://www.user.tu-berlin.de/komm/paper/SAINT06-Distributed-workflow.pdf}, doi = {10.1109/SAINT.2006.25} } |
Burchard L-O, Linnert B and Schneider J (2005), "A Distributed Load-Based Failure Recovery Mechanism for Advance Reservation Environments", In 5th ACM/IEEE Intl. Symposium on Cluster Computing and the Grid (CCGrid)., 5, 2005. Vol. 2, pp. 1071-1078. |
Abstract: Resource reservations in advance are a mature concept for the allocation
of various resources, particularly in Grid environments. Common Grid tool kits support advance reservations and assign jobs to resources at admission time. In such a distributed environment, it is necessary to develop carefully tailored failure recovery mechanisms that provide seamless transparent migration of jobs from one resource to another. As the migration of running jobs is difficult, an important issue in advance reservation, i.e., planning based, management infrastructures is to determine the duration of a failure in order to remap jobs that are already allocated to a currently failed resource but not yet active. As shown in previous work, underestimations of the failure duration and as a consequence the remapping of too few jobs results in an increased amount of job terminations. In order to overcome this drawback, in this paper we propose a load-based computation of the jobs to be remapped. A centralized and a distributed version of the strategy are presented, showing it is not necessary to have knowledge beyond the local allocation on the failed resource. These load-based strategies achieve effective remapping of jobs while avoiding - inevitably inaccurate - estimations of the failure duration. |
BibTeX:
@inproceedings{Burchard2005b, author = {Lars-Olof Burchard and Barry Linnert and Jörg Schneider}, title = {A Distributed Load-Based Failure Recovery Mechanism for Advance Reservation Environments}, booktitle = {5th ACM/IEEE Intl. Symposium on Cluster Computing and the Grid (CCGrid)}, year = {2005}, volume = {2}, pages = {1071-1078}, url = {http://www.user.tu-berlin.de/komm/paper/CCGrid05-load-based-failure-recovery.pdf}, doi = {10.1109/CCGRID.2005.1558679} } |
Burchard L-O, Rose CAFD, Heiß H-U, Linnert B and Schneider J (2005), "VRM: A Failure-Aware Grid Resource Management System", In Proceedings of the 17th International Symposium on Computer Architecture and High Performance Computing., 10, 2005. , pp. 218-225. IEEE press. |
Abstract: For resource management in Grid environments, advance reservations
turned out to be very useful and hence are supported by a variety of Grid toolkits. However, failure recovery for such systems has not yet received the attention it deserves. In this paper, we address the problem of remapping reservations to other resources, when the originally selected resource fails. Instead of dealing with jobs already running, which usually means checkpointing and migration, our focus is on jobs that are scheduled on the failed resource for a specific future period of time but not started yet. The most critical factor when solving this problem is the estimation of the downtime. We avoid the drawbacks of under- or overestimating the downtime by a dynamic load-based approach that is evaluated by extensive simulations in a Grid environment and shows superior performance compared to estimation-based approaches. |
BibTeX:
@inproceedings{Burchard2005c, author = {Lars-Olof Burchard and Cesar A. F. De Rose and Hans-Ulrich Heiß and Barry Linnert and Jörg Schneider}, title = {VRM: A Failure-Aware Grid Resource Management System}, booktitle = {Proceedings of the 17th International Symposium on Computer Architecture and High Performance Computing}, publisher = {IEEE press}, year = {2005}, pages = {218--225}, url = {http://www.kbs.cs.tu-berlin.de/publications/fulltext/BurchardEtAl-2005-VRM.pdf} } |
Burchard L-O, Schneider J and Linnert B (2005), "Rerouting Strategies for Networks with Advance Reservations", In First IEEE International Conference on e-Science and Grid Computing (e-Science 2005). Melbourne, Australia, 12, 2005. , pp. 446-453. IEEE CS Press. |
Abstract: Network transmissions in high performance networking scenarios, e.g.,
used for e-science or Grid applications, require quality-of-service guarantees concerning bandwidth availability, but also timing constraints, e.g., deadlines, must be met. Current research efforts concentrate on supporting such environments with SLA-aware advance reservation management systems. Hence, the robustness of the management system against network failures is an important issue, especially since failures frequently occur in networks. Since accurate knowledge about the failure duration is unlikely available and estimations lead to considerably degraded performance, in this paper we present a novel load-based approach for dealing with link failures in advance reservation environments. The approach does not rely on prediction of the downtime, but instead reroutes flows only based on available information about the network. |
BibTeX:
@inproceedings{Burchard2005a, author = {Lars-Olof Burchard and Jörg Schneider and Barry Linnert}, title = {Rerouting Strategies for Networks with Advance Reservations}, booktitle = {First IEEE International Conference on e-Science and Grid Computing (e-Science 2005)}, publisher = {IEEE CS Press}, year = {2005}, pages = {446-453}, url = {http://www.user.tu-berlin.de/komm/paper/eScience05-rerouting-of-advance-reservations.pdf}, doi = {10.1109/E-SCIENCE.2005.71} } |
Burchard L-O, Schneider J and Linnert B (2005), "Distributed Workflow Management", In Mitteilungen - Gesellschaft für Informatik e. V., Parallel-Algorithmen und Rechnerstrukturen., 12, 2005. Gesellschaft für Informatik e.V. |
BibTeX:
@inproceedings{BurchardEtAl-2005-WORM2, author = {Lars-Olof Burchard and Jörg Schneider and Barry Linnert}, title = {Distributed Workflow Management}, booktitle = {Mitteilungen - Gesellschaft für Informatik e. V., Parallel-Algorithmen und Rechnerstrukturen}, publisher = {Gesellschaft für Informatik e.V.}, year = {2005} } |
Burchard L-O, Linnert B, Heiß H-U and Schneider J (2004), "Resource Co-Allocation in Grid Environments", In Synergies between Information and Automation: 49. Internationales Wissenschaftliches Kolloquium. Shaker. |
Abstract: The co-allocation of different resources is an essential functionality
of resource management systems in distributed environments in order to assure deterministic behaviour of the system, e.g., for quality-of-service (QoS) guarantees. For example, parallel programs require the allocation of resources on several processors. In grid computing environments, the resource management system needs to fulfil more complex tasks. As grid computing covers a large variety of different resources and resource types, a job submitted to the Grid may consist of many different sub-jobs which must be accomplished in a coordinated manner in order to obtain the desired result. For this purpose, guarantees may be given in this case for the completion time, e.g., specified as service level agreements (SLA). Besides other tasks, such as identification and discovery of suitable resources in the Grid, a critical task for the resource management in such a case is to allocate all of the different resources needed to comply with an SLA. In this paper, the concept of malleable requests for co-allocation is introduced, which allows a reliable reservation with guaranteed QoS as well as enhanced flexibility for clients and operators. |
BibTeX:
@inproceedings{Burchard2004c, author = {Lars-Olof Burchard and Barry Linnert and Hans-Ulrich Heiß and Jörg Schneider}, title = {Resource Co-Allocation in Grid Environments}, booktitle = {Synergies between Information and Automation: 49. Internationales Wissenschaftliches Kolloquium}, publisher = {Shaker}, year = {2004} } |