Weighting factor elicitation for sustainability assessment of energy technologies †

In this paper, an approach for sustainability assessment of innovative energy technologies is expanded by multi-criteria decision analysis (MCDA) methods to aggregate indicator results and support decision-making. One of the most important steps for MCDA is to determine weighting factors for individual indicators. Thus, a workshop was performed to elicit weighting factors for sustainability assessments of energy technologies from developers of such technologies and energy system modellers from academia. These stakeholders expressed their preferences with respect to sustainability criteria using the Simple Multi Attribute Rating Technique (SMART). A triple bottom line approach of sustainable development was used as the basis for the aggregation of indicator results. This approach is based on Life Cycle Costing, Life Cycle Assessment and social indicators. Obtained weighting factors were applied to an integrative sustainability assessment with the aggregation method Preference Ranking Organization METHod for Enrichment of Evaluations (PROMETHEE). Hydrogen-based mobility as an important technology to foster decarbonization in the transport sector is used as a case study for the application of the derived weighting factors. A conventional vehicle, powered by fossil fuel, is compared with a fuel cell electric vehicle (FCEV) for the year 2050. Different options (pipeline, compressed gaseous hydrogen, liquid hydrogen, liquid organic hydrogen carrier) are discussed for the supply of hydrogen. The results for this weighting factor set are compared with an equal weighting scenario of the three sustainability dimensions and indicators within one sustainability dimension. The FCEV, using pipelines for hydrogen supply, came out first in the assessment as well as in all sensitivity analyses.


Introduction
The transformation of the energy system is a prerequisite to meet the goals of the Paris agreement. 1 To enable this transformation process, innovative energy technologies are necessary. Hence, the European Commission pushed forward the European Green Deal, a concept for Europe to become climate neutral by 2050 and to transform the EU's economy in a sustainable manner. 2 In order to analyse greenhouse gas reduction potentials without losing track of other associated effects, a comprehensive sustainability assessment is necessary. Besides other environmental impacts, this should also include economic as well as social impacts. 3 The interpretation of such a sustainability assessment, however, is challenging because many different single results, i.e. indicator results with different units of measure, are obtained, and on this basis it is not possible to propose one unambiguous solution unless one alternative performs best with respect to all indicators. Finding the best alternative for a set of sustainability indicators within a decision-making process is a very demanding task: not only must indicators within one sustainability dimension, e.g. different environmental indicators, be put in relation to each other, but also indicators referring to different sustainability dimensions, i.e. environmental, economic, and social indicators if using the triple-bottom-line concept of sustainability. 4 In order to aggregate single indicator results within the framework of a comprehensive sustainability assessment, mathematical procedures can be used, for which the individual indicators generally need to be weighted in a first step. In that sense, Multi-Criteria Decision Analysis (MCDA) methods, developed to guide different decision-making processes, can support and facilitate the interpretation of sustainability assessments. MCDA comprises mathematical approaches to condense a large number of individual results into fewer, more manageable results. 5,6 Hannouf and Assefa 7 have already proposed a framework to couple Life Cycle Sustainability Assessment (LCSA) with a decision analysis. Irrespective of the choice of an MCDA approach, the question of how to weight the sustainability indicators remains. As a first approximation, equal weighting of indicators can be combined with a sensitivity analysis to identify tipping points of preferences. Other generic approaches for deriving weighting factors include context-specific weighting sets 8 or standardized profiles. 9 Apart from these generic approaches, MCDA encourages the integration of stakeholders, e.g. technology developers, citizens, political decision-makers, in different stages of the MCDA process, 10 in particular for the determination of weighting factors for criteria and indicators. This is a crucial step in the MCDA process because it allows the integration of stakeholder values into the decision-making process. Even if no specific weighting factors are used and every indicator is treated as equally important, this is still an important decision that should be made consciously.
Our previously developed approach for sustainability assessment, 11 drawing on the triple-bottom-line approach, offers a comprehensive way to assess energy technologies based on a life cycle perspective. Even though this approach mentions MCDA in order to simultaneously consider the assessed indicators, no recommendations for its implementation are given. In this work, this sustainability assessment is extended using a participatory approach for the determination of weighting factors for sustainability dimensions and respective indicators, involving technology developers and energy system modellers from academia for preference elicitation. The collected weighting factor set is to be used to aggregate the indicator results into an overall performance value within a general framework of sustainability assessment of energy technologies. To account for the triple-bottom-line model of sustainability, a hierarchical weighting approach suggests itself, i.e. first, indicators of one and the same sustainability dimension are related to each other and second, the three sustainability dimensions are related to each other. In this paper, the resulting set of weighting factors is tested on a first case study on hydrogen mobility.
Hydrogen mobility comprises several technologies. It is based on hydrogen vehicles, e.g. buses, trucks, trains and passenger cars, mostly powered by a fuel cell and rarely by an internal combustion engine, which has a lower efficiency than a fuel cell. Hydrogen mobility can be climate friendly when the hydrogen is produced from renewable energy sources, e.g. wind or solar power. 12 It is one technology to enable the electrification of sectors where direct electrification is not possible. Hydrogen is easier to store in large quantities over long periods than electricity. Furthermore, it is frequently discussed as a cornerstone in energy transformation scenarios, 13,14 and current German scenarios show that larger shares of green hydrogen in the system can significantly reduce the import quota of energy. 15 In this case study the focus is on passenger fuel cell electric vehicles (FCEVs), because they are among the hydrogen technologies that are further developed compared to e.g. hydrogen trains. They emit only water at the point of use and therefore, in addition to lowering impacts on climate change, help reduce emission levels in cities, e.g. of particulate matter or nitrogen oxides. 16 However, substantial amounts of energy are lost in the hydrogen supply chain during production as well as transport and distribution. 12 Furthermore, trade-offs between mineral resource depletion as well as economic and social impacts occur. 17 Hydrogen refuelling stations are still a technology of concern for many people, 18 and currently FCEVs are much more expensive to purchase and operate than comparable vehicles with an internal combustion engine (ICE). 19 Thus, a thorough sustainability assessment of the use of FCEVs in comparison to a conventional gasoline ICE vehicle is necessary. For the supply of hydrogen from renewable sources for the FCEV many different options are available. 12 We decided to focus on supply within Germany, because energy imports have a large political component that is difficult to address. Furthermore, we are only looking at green hydrogen, i.e. hydrogen from water electrolysis, because this is the preferred option for hydrogen production from renewable sources in Europe and Germany. 20,21 Even with these constraints, hydrogen supply still has several technology options regarding its transport, which will be part of this case study.
Our previously developed approach for sustainability assessment 11 is used to carry out an indicator-based sustainability assessment of different hydrogen mobility options in Germany. Within the framework of the extended approach, weighting factors for sustainability indicators and dimensions are determined through a stakeholder survey and are used for the aggregation of indicators with the MCDA method Preference Ranking Organization METHod for Enrichment of Evaluations (PROMETHEE). Before the (extended) approach for sustainability assessment is introduced in more detail and is subsequently applied to a case study on hydrogen mobility, the current discussion on weighting factor determination is presented.

State of the art of weighting factor determination
In recent years, more and more publications on environmental and sustainability assessment have discussed the use of MCDA, and of weighting factors in particular, for their final evaluation. This includes theoretical approaches as well as the integration of stakeholders. Most of the publications follow a hierarchical structure. For the first hierarchical level a sustainability concept, e.g. the triple-bottom-line concept, is used and provided with weighting factors. Then, as a second hierarchical level, each indicator within each sustainability dimension is provided with a weighting factor.
In this section the most important approaches for weighting factor determination are reviewed as a basis for the approach in this paper.
Thies et al. 22 showed in their review how sustainability assessment can be complemented and improved by the use of MCDA and other operational research methods. According to their findings, an important part of MCDA is the way of preference articulation (weighting), which profoundly impacts the entire decision-making process. 23 As an approximation, theoretical approaches can be applied. Haase et al. 24 used such an approach to perform a hierarchical equal weighting for the sustainability assessment comparing different types of passenger vehicles. On the first hierarchy level all three sustainability dimensions were considered equally important and on the second hierarchy level all indicators within a sustainability dimension were considered equally important. The sensitivity analysis showed that the ranking of the vehicles remained constant even if the weighting factor for the sustainability dimensions ($w_i = 0.33$) was changed by ±0.10.
Instead of equal weighting, Ekener et al. 9 proposed to use the stakeholder profiles individualist, hierarchist and egalitarian from cultural theory as a guideline for deriving weighting factors for the three sustainability dimensions within sustainability assessment. Since each of the three stakeholder profiles has different priorities regarding the sustainability dimensions, different weighting factors were derived with the cardinal ranking approach. For each profile the weighting factors of 0.60 (1st priority), 0.28 (2nd priority) and 0.12 (3rd priority) were assigned to the sustainability dimensions. Ekener et al. 9 applied these weighting factors in a case study regarding four biofuels. In all three profiles the most sustainable fuel stayed the same, but rankings of the other three assessed fuels changed when applying a different stakeholder profile.
For specific decision-making situations it can be better to integrate relevant and, if possible, representative stakeholders to support a participatory and collaborative decision-making process. In this case, an interface that allows stakeholders to express their preferences regarding the selected criteria is required. 25 In this way, MCDA methods make it possible to capture the strategic intelligence of a group of stakeholders who bring together different experiences, knowledge, and expectations. This offers a way to identify preferences of the group related to a specific problem. Furthermore, MCDA methods also make it possible to identify whether opinions differ and to enable a process to resolve potential discrepancies. 26 There are several ways to reach the participants, ranging from interviews and decision conferencing to online surveys. 23 The selection of the method depends on several factors, such as the number of stakeholders involved in the process, available time and money, and the complexity of the problem. Several methods for weighting are available, e.g. methods based on trade-offs (Simple Multi Attribute Rating Technique (SMART)), 27 direct rating, lotteries and pairwise comparisons, e.g. the Analytic Hierarchy Process (AHP). 28 While SMART and the AHP are methods to define weights that are globally applicable, other methods like the conjoint analysis and the discrete choice experiment are based on the respective use cases. 29,30
With respect to environmental indicators, Sala et al. 8 developed a generic weighting factor set for environmental footprint categories. They conducted surveys among different stakeholder groups, i.e. lay people and LCA experts, as well as a workshop with life cycle impact assessment experts. Those results are combined with a robustness factor for each environmental indicator. This weighting factor set includes 16 different environmental indicators, of which climate change is regarded as by far the most important with a weighting factor of 21.06 (on a scale to 100). Human toxicity, non-cancer has the lowest weighting factor of 1.84, which is heavily influenced by the included robustness factor.
Tarne et al. 29 supported the sustainability assessment of car manufacturing by surveying decision makers in a German automotive company with the help of a limit conjoint analysis. Even though the individual results from the 54 participants had a wide range, the average weighting factors for the three sustainability dimensions were close together (environment $w_{\mathrm{env}} = 0.352$, economy $w_{\mathrm{eco}} = 0.335$ and social $w_{\mathrm{soc}} = 0.312$). The derived weighting factors are representative regarding: (i) geographic location, (ii) time, (iii) subject and (iv) perspective. In this case the representativeness is given for (i) Germany, (ii) the year 2018, (iii) cars and (iv) manufacturers. This example illustrates the limited applicability and transferability of results gathered through stakeholder integration into sustainability assessment in general.
The papers mentioned so far were guided by the triple-bottom-line approach of sustainability, i.e. environment, economy and society. If a different understanding of sustainability is chosen, results may differ considerably: for the sustainability assessment of German energy system transformation pathways, Naegler et al. 30 conducted a discrete choice experiment with citizens. Unlike in many other studies, sustainability was not divided into three dimensions, but described by seven indicators, i.e. climate change, human health, resources (land), resources (mineral, metals, fossils), system costs, security of supply and employment. The elicited results were adjusted with a robustness factor, leading to a dominating weight of climate change (CC) with $w_{\mathrm{CC}} = 0.532$, which subsequently determined the results of their sustainability assessment. 30
None of the sources mentioned elaborate weighting factor sets for both the sustainability dimensions and the corresponding specific indicators within the respective dimensions of our approach. Furthermore, none of the surveys took place in the context of energy system transition in Germany.

Approach for sustainability assessment including weighting factor determination and indicator aggregation
In this paper, the earlier mentioned approach for prospective sustainability assessment of energy technologies according to Haase et al. 11 was extended. It comprises Life Cycle Assessment (LCA) and Life Cycle Costing (LCC) and corresponding indicators as well as social indicators, derived from a normative concept of sustainable development. 11 Furthermore, MCDA was applied for the assessment via two steps, see Fig. 1: (I) preferences, i.e. weighting factors for sustainability indicators and dimensions, were determined through a participatory stakeholder survey and (II) sustainability indicators were aggregated through outranking including the weighting factors from the stakeholder survey. Therefore, the MCDA methods SMART for the stakeholder survey and PROMETHEE for aggregation of sustainability indicators were chosen. All methods used are further described in the following subsections. Lists of the considered criteria and indicators can be found in Table 1. In this paper, a sustainability criterion might contain several indicators describing the criterion.

Environmental assessment
For the environmental assessment an LCA was carried out. With an LCA two or more alternatives of a product, service or organisation can be assessed concerning the potential impact on ecosystems, human health and resources. The premise of this methodology is that environmental impacts are not limited to the production process itself (foreground). They may also occur in the background, i.e. pre-chains. This includes for example electricity generation or steel production. Here an LCA following ISO 14040 and 14044 was carried out: after goal and scope definition, the so-called Life Cycle Inventory (LCI) was built up, which includes all resource consumption and emissions along the value chains under consideration. Based thereon, Life Cycle Impact Assessment (LCIA) was carried out and results were interpreted. In the case study (see section Case study: hydrogen mobility), the upstream and downstream processes, e.g. raw materials supply, provision of operating materials and infrastructure, waste and wastewater disposal, and product use, were modelled via the open source software openLCA v1.7 together with datasets from the commercial ecoinvent database v3.3 (cut-off system model). 31 As far as possible, specific datasets for Germany (DE) were used. If no datasets were available for Germany, datasets for Switzerland (CH), Europe (RER), or worldwide datasets (GLO) were used. For LCIA, 13 environmental impact categories and corresponding indicators were applied at midpoint level as recommended in the ILCD Handbook of the European Commission 32 and implemented in the software openLCA (LCIA methods v2, ILCD 2011, midpoint). 33

Economic assessment
For the economic assessment, LCC was used. LCC aims at assessing all costs related to a product over its entire life cycle, i.e. purchase, use, disposal and recycling, respectively. 34 Goal and scope definition is similar to that of an LCA. If LCC is used in parallel with LCA, system boundaries of LCC need to be equivalent to system boundaries of LCA and identical functional units should be used. 34 Different parts of the product system may fall below relevant cut-off criteria for the separate LCC and LCA components. For example, early research and development may impose significant costs but little environmental impact. 34 As cost data may be gathered in different currencies and reflect different time periods, economic inventory data needs to be adjusted to a common currency and reference year using appropriate exchange and discount rates. 34 There is no comparable impact assessment phase in an LCC, because all inventory data comprise a single unit of measure, namely currency. 34 Procedures for interpretation, communication, and review are analogous to those for LCA. If LCC analyses are carried out from the user's perspective, the term "total costs of ownership" (TCO) is commonly used. 35 According to VDI, 36 life cycle costs can be divided into the three stages "before utilization", "during utilization" and "after utilization" or into three different cost types: "Capital Expenditures" (CAPEX), "Operational Expenditures" (OPEX) and "End Of Life Expenditures" (EOLEX). Within this study, levelized total costs (LTC), comprising CAPEX and OPEX, were used as an economic indicator, cf. Haase et al. 11 Furthermore, in this LCC no external costs, taxes, or subsidies were considered. 37

Social assessment
For the social assessment, three criteria, i.e. acceptance, domestic value added and innovation potential and corresponding indicators were considered based on Haase et al. 11 Since these indicators are not standardised yet, their background and assessment are described in more detail below.
Acceptance. The acceptance of energy technologies is a prerequisite for their development and application. 38 According to Assefa and Frostell, 39 there is an association between what the public feels and thinks about a technology and its knowledge of that technology, and they emphasize three indicators, i.e. knowledge, perception, and fear, for the assessment of social acceptance. Depending on different factors, e.g. depth and type of information needed, representativeness, or constraints regarding time and costs, different methods can be chosen for data collection. These methods encompass for example in-person interviews or web-based questionnaires. 39 In this approach, acceptance is investigated using an online survey based on the methodological background of Huijts et al. 40 and Miguel et al. 41 After a short description of the respective technology, nine different types of concerns are offered for selection, including an open field for further concerns. Furthermore, socio-demographic data is collected regarding gender, residence, income, activity, age, and education. The freely available online platform SoSci-Survey was used for conducting the survey. A more detailed description of the approach can be found in Emmerich et al. 42 Due to practical issues this indicator could not be considered in this study.
Patents as indicator for innovation potential. Patent-based indicators can serve as a proxy for the innovation potential. 43 Patents guarantee protection of company knowledge, but can also provide information about environmental benefits and produced social well-being 44 and promote social change. 45,46 Here, a combination of patent-based indicators is used based on Baumann et al.: 43 (i) patent intensity, i.e. patenting activity in a certain technology field per country relative to the respective GDP (the purchasing power parity (PPP) adjusted GDP was used); (ii) patent growth rate (in %) and (iii) national technology share (relative R&D emphasis of a country related to a single technology). The patent intensity mirrors the national patenting activity in relation to a country's economic growth. A high patent growth rate is interpreted as a high innovation potential due to increased research effort in the area, whilst a high national technology share indicates a strong R&D focus of a country. 47 The European Patent Office (EPO) database including the Open Patent Service (OPS) was used for patent analysis. For patent search and data analysis, an adapted and freely available Python-based patent database crawler together with an MS Excel template was used. 43 The patent search was carried out for the time period from 1995 to 2018 using Cooperative Patent Classification (CPC) codes and keywords (see ESI Table 1 for details †). Based on the results, the template selects the five most active patenting countries in the considered technology field and compares them to Germany. The different patent indicators are presented in a portfolio analysis that includes the relevant countries for the selected technology.
The goal of these indicators is to assess the innovation potential that a technology has for a specific country. Thus, in a comparison between different technologies a higher innovation potential is considered better. In this study, the focus was on Germany. To quantify the innovation potential in a country comparison, the different indicator results are ranked from 1 to 5 ($R_i$, the lower the better). For each country under consideration the mean rank $\bar{R}$ over the $n$ indicators is calculated (eqn (1)):
$$\bar{R} = \frac{1}{n} \sum_{i=1}^{n} R_i \qquad (1)$$
Here, three indicators were analysed, thus $n = 3$.
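The mean-rank calculation described above can be illustrated with a minimal Python sketch. The country labels and rank values below are hypothetical placeholders and are not results of this study; the function name `mean_rank` is likewise only illustrative.

```python
from statistics import mean

# Hypothetical ranks (1 = best, 5 = worst) per country for the three patent
# indicators: patent intensity, patent growth rate, national technology share.
# Labels and values are illustrative only, not data from the study.
ranks = {
    "Country A": [1, 3, 2],
    "Country B": [2, 1, 4],
    "Country C": [4, 4, 3],
}


def mean_rank(indicator_ranks):
    """Arithmetic mean of the indicator ranks (lower is better)."""
    return mean(indicator_ranks)


# Mean rank per country and the resulting order (best country first).
mean_ranks = {country: mean_rank(r) for country, r in ranks.items()}
ordered = sorted(mean_ranks, key=mean_ranks.get)

for country in ordered:
    print(f"{country}: mean rank {mean_ranks[country]:.2f}")
```

With three indicators, the mean rank lies between 1 (best in every indicator) and 5 (worst in every indicator).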
Domestic value added. Assuming that local investment, i.e. locally produced goods, creates or secures local jobs, the fraction of domestic value added was used here as an indicator for local job creation potential. 11 This indicator aims to give an indication whether the energy technology might have a positive effect on job development compared to another (conventional) technology. In this study, the fraction of domestic value added was based on methods, data and results from the economic assessment. 48 System boundaries, therefore, correspond to those of LCC. For each cost component, considerations were made on the fraction of domestic value added, including information on the country's raw material deposits, located industries and their capabilities. If necessary, cost components were further specified, i.e. the technology under consideration was split up into its modules and all life cycle phases. 48 Then, percentages of domestic value added were estimated for each cost component and summed up to the fraction of domestic value added of total costs. 11,48 A detailed description of the method can be found in Harzendorf et al. 48
Multi-criteria decision analysis. MCDA methods make it possible to organize available information, to explore perceptions and needs, and to identify consequences of a decision to support the decision makers involved. 49 Decision problems are expressed in the form of equations (the simplest and most widely known being the weighted sum), with inputs, e.g. LCA results, and coefficients, e.g. weighting factors, that can be observed and reproduced. Depending on the problem at hand, stakeholder integration can, should or must be done.
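As a minimal illustration of the weighted sum mentioned above, the following Python sketch aggregates indicator results with weighting factors. It assumes, purely for illustration, that all indicator results have already been normalised to the range 0 to 1 with higher values being better; the alternative names, scores and weights are hypothetical.

```python
def weighted_sum(scores, weights):
    """Aggregate normalised indicator scores with weighting factors.

    scores:  {indicator: normalised value in [0, 1], higher is better}
    weights: {indicator: weighting factor}, expected to sum to 1
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weighting factors must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical weighting factors and normalised indicator results.
weights = {"costs": 0.4, "climate change": 0.6}
alternatives = {
    "Alternative 1": {"costs": 0.5, "climate change": 1.0},
    "Alternative 2": {"costs": 1.0, "climate change": 0.2},
}

totals = {name: weighted_sum(s, weights) for name, s in alternatives.items()}
best = max(totals, key=totals.get)
print(totals, best)
```

Because a poor result in one indicator can be fully offset by a good result in another, the weighted sum is a compensatory method.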
In general, MCDA can be divided into two major sequences, which can overlap and which are characterized by an iterative process as follows: 10
(1) Definition of the decision problem, alternatives, and corresponding criteria and indicators as well as selection of suitable (MCDA) methods, preferably in cooperation with stakeholders.
(2) Aggregation of indicator or criteria results using weighting factors, preferably from stakeholders, and suitable MCDA methods, corresponding interpretation and analysis of aggregation results and decision process.
Within the second step, it is highly important to select suitable methods that fit the given type of problem. One well-known method is the weighted sum approach, which is very easy to handle but fails to cover more complex decision-making contexts. In contrast, elaborate MCDA approaches like Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), PROMETHEE or ELimination Et Choix Traduisant la REalité (ELECTRE) can address more complex problems, for example including thresholds for indicator values. Each of these methods has its strengths and weaknesses and should be selected with care for each assessment. 50 Within this approach the MCDA procedure according to Fig. 1 is applied:
(I) Determination of weighting factors with a participatory approach: SMART. 27 The advantage of SMART is its generally low complexity. At the same time the chance of bias is less pronounced compared to other approaches with low complexity. 51
(II) Aggregation of indicator results: PROMETHEE. 52 PROMETHEE was chosen because indicator compensation should be avoided and because it is the easiest to apply among the complex outranking methods. 53
SMART. The execution of SMART may differ. 27 For the approach here, the emphasis lies on simplicity for the included stakeholders. Thus, the following procedure was chosen. The most important indicator receives 100 points. Subsequently, the other indicators are assigned scores that express their importance relative to the most important one. This results in a score range of 0 to 99 points for the remaining indicators. The scores of the individual indicators are then added up to give a total score. The weighting values are derived from the relative proportion of the total score. The calculation is set out in eqn (2):
$$w_i = \frac{P_i}{\sum_{k} P_k} \times 100\% \qquad (2)$$
with $w_i$ = weighting value in percent for the respective criterion, $P_i$ = score of the respective criterion. 27,54
PROMETHEE. PROMETHEE is a family of outranking methods (A is better than B, regardless of how much better A is compared to B) developed by Brans and colleagues in the early 1980s. 52,55 Since then, several variations of this method have been developed. This includes PROMETHEE I-VI. 56 The most relevant variations are PROMETHEE I and II, which will be used in this paper.
The principle of PROMETHEE is based on a pairwise comparison of alternatives along each criterion. PROMETHEE I gives a partial ranking of the alternatives with the outranking flows $\Phi^+$ and $\Phi^-$. The higher $\Phi^+$ and the lower $\Phi^-$ are, the better is the overall rank of the analysed option. However, this can also lead to incomparabilities when $\Phi^+$ and $\Phi^-$ indicate different preferences. PROMETHEE II adds a step to derive a complete ranking of the alternatives (outranking flow $\Phi$) by calculating the difference between the two flows. This leads to a complete ranking at the cost of some loss of detail. 57 To account for very small differences in the pairwise comparisons, preference functions are introduced. They translate the difference between the indicator results obtained by two alternatives into a preference degree ranging from zero to one. In this way, the users can implement their opinion on what preference actually means (A is only considered better than B when A differs from B by at least, e.g., 5%). Depending on the analysed indicator, different functions are appropriate with regard to the level of uncertainty and the nature of the values, i.e. qualitative, discrete or continuous. 58
Implementation. The implementation of SMART and PROMETHEE is presented in the following two sections.
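The PROMETHEE principle outlined above (pairwise comparison, preference functions, positive and negative outranking flows, net flow) can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the implementation used in this study: it assumes a linear (V-shape) preference function with one preference threshold per criterion, and all alternative names, indicator values, weights and thresholds are hypothetical.

```python
def linear_preference(d, p):
    """V-shape preference function: 0 for d <= 0, linear up to 1 at d >= p."""
    if d <= 0:
        return 0.0
    return min(d / p, 1.0)


def promethee(performance, weights, maximise, thresholds):
    """Return positive, negative and net outranking flows per alternative.

    performance: {alternative: {criterion: value}}
    weights:     {criterion: weighting factor}, summing to 1
    maximise:    {criterion: True if higher values are better}
    thresholds:  {criterion: preference threshold p (assumed, > 0)}
    """
    alts = list(performance)
    n = len(alts)
    phi_plus = {a: 0.0 for a in alts}
    phi_minus = {a: 0.0 for a in alts}
    for a in alts:
        for b in alts:
            if a == b:
                continue
            # Aggregated preference of a over b across all criteria.
            pi_ab = 0.0
            for c, w in weights.items():
                d = performance[a][c] - performance[b][c]
                if not maximise[c]:
                    d = -d  # for minimised criteria, lower values are preferred
                pi_ab += w * linear_preference(d, thresholds[c])
            phi_plus[a] += pi_ab / (n - 1)   # a outranks b
            phi_minus[b] += pi_ab / (n - 1)  # b is outranked by a
    phi_net = {a: phi_plus[a] - phi_minus[a] for a in alts}
    return phi_plus, phi_minus, phi_net


# Hypothetical example with two minimised criteria.
performance = {
    "X": {"costs": 10, "ghg": 5},
    "Y": {"costs": 12, "ghg": 2},
    "Z": {"costs": 15, "ghg": 6},
}
weights = {"costs": 0.5, "ghg": 0.5}
maximise = {"costs": False, "ghg": False}
thresholds = {"costs": 5, "ghg": 4}

phi_plus, phi_minus, phi_net = promethee(performance, weights, maximise, thresholds)
ranking = sorted(phi_net, key=phi_net.get, reverse=True)  # PROMETHEE II order
print(ranking, phi_net)
```

The positive and negative flows correspond to the PROMETHEE I information, while sorting by the net flow gives the complete PROMETHEE II ranking. The net flows always sum to zero, which can serve as a quick consistency check.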
Participatory approach for deriving weighting factors using SMART. Researchers and developers of technologies, with a focus on the energy transition to a low carbon future, were asked for their opinion about the importance of sustainability criteria for the assessment of emerging technologies, e.g. biomass, electricity transmission, energy storage and hydrogen-based technologies in the context of the German energy transition. The survey took place during a hybrid (onsite/online) conference with 89 attendees in September 2020. During a 50-minute session the basics of the SMART method were presented as well as a short recap of the criteria that were included in the survey. For each criterion the influencing factors, e.g. acidifying emissions, the impact pathways, e.g. lowering of the pH-value in the soil, and the final impacts, e.g. decrease in biodiversity, were illustrated. For this introduction of criteria it was taken into account that the audience came from academia with a technical/natural science background. Due to some technical issues, not all onsite attendees were able to participate. After each presentation the attendees were asked to participate in an online survey. In total three surveys were conducted.
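The SMART scoring procedure used in the survey (100 points for the most important item, lower scores for the others, weights as share of the total score) can be sketched in Python for a single participant. The point values below are hypothetical and do not reproduce survey results; the function name `smart_weights` is only illustrative.

```python
def smart_weights(points):
    """Convert SMART points into weighting values in percent.

    points: {criterion: points}, where the most important criterion has
            100 points and all others have between 0 and 99 points.
    """
    if max(points.values()) != 100:
        raise ValueError("the most important criterion must receive 100 points")
    if min(points.values()) < 0:
        raise ValueError("points must not be negative")
    total = sum(points.values())
    return {name: 100.0 * p / total for name, p in points.items()}


# Hypothetical answer of one participant for the three dimensions.
answer = {"environment": 100, "economy": 60, "society": 40}
weights_percent = smart_weights(answer)
print(weights_percent)
```

In this example the total score is 200 points, so the environmental dimension receives half of the total weight.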
At the first hierarchy level, a weighting procedure based on the triple bottom line approach of sustainability 4 was chosen. Participants were invited to give a weight to each of the three dimensions. They were asked 'How important do you consider the following sustainability dimensions (economy, society, environment)?'. The second hierarchy level contains the indicators within each dimension of sustainability. As the economic dimension only contains one indicator, see subsection Economic assessment, no further survey was necessary. Questions on the environmental and social criteria were phrased analogously to the question on the sustainability dimensions. It should be noted that this weighting factor elicitation was carried out independently of a specific energy technology or case study and is therefore applicable to other energy technologies.
To distinguish between the weighting factors of criteria within one of the three sustainability dimensions, i.e. environment, economy, and society, and the final weighting factors of all criteria, they are from here on referred to as local weighting factors, e.g. $w_{\mathrm{local,env,CC}}$, and global weighting factors, e.g. $w_{\mathrm{global,CC}}$. The graphical representation is depicted in Fig. 2. The global weighting factors $w_{\mathrm{global},j}$ (with $j$ representing the different sustainability criteria, e.g. climate change (CC)) are the product of the local weighting factor $w_{\mathrm{local},ij}$ (with $i$ representing the three sustainability dimensions) and the weight of the respective sustainability dimension $w_{\mathrm{Sus},i}$ (eqn (3)).
$$w_{\mathrm{global},j} = w_{\mathrm{local},ij} \cdot w_{\mathrm{Sus},i} \qquad (3)$$
with $\sum_{j}^{n} w_{\mathrm{global},j} = 1$, $\sum_{i}^{3} w_{\mathrm{Sus},i} = 1$ and $\sum_{j}^{n} w_{\mathrm{local},ij} = 1$.
Following the approach for sustainability assessment described above, the environmental assessment with LCA included 13 indicators. Even though the SMART method allows the comparison of a higher number of aspects, asking stakeholders for the relative evaluation of 13 different indicators was assumed to exceed the cognitive capacity of the participants. 59 Thus, the 13 indicators were clustered into eight environmental criteria. It was assumed that the indicators within one criterion are equally important. The indicators ionizing radiation, human toxicity, cancer and human toxicity, non-cancer were clustered into one criterion "Human toxicity". Furthermore, the three indicators eutrophication, freshwater, eutrophication, marine and eutrophication, terrestrial were clustered together with ecotoxicity into the criterion "ecotoxicity". To derive the weighting factors for the 13 indicators based on the survey of the eight criteria, in a first step, each of the 13 indicators received the weight of the criterion it belongs to (environmental indicator criterion weight $w_{\mathrm{env},cj}$). This violates the condition $\sum_{j} w_{j} = 1$ and results in a sum larger than one. Thus, in a second step, each environmental indicator criterion weight $w_{\mathrm{env},cj}$ was divided by the sum of the indicator criterion weights (eqn (4)).
$$w_{\mathrm{local,env},j} = \frac{w_{\mathrm{env},cj}}{\sum_{j=1}^{13} w_{\mathrm{env},cj}} \qquad (4)$$
This resulted in the local environmental weighting factors.
Definition of the preference functions for PROMETHEE. In this paper, the values of the discussed criteria are continuous as well as discrete. For continuous results, the linear preference function is most appropriate; it requires the definition of the thresholds q (indifference value) and p (preference value). 60 As innovation potential gives discrete values as result, i.e. technology A is better than technology B, the usual preference function is sufficient, and no thresholds need to be defined. For the environmental indicators (index i), 5% and 10% of the minimum value across all analysed options (A, ..., N) were chosen as thresholds q and p, respectively, to account for any uncertainty, 58 see eqn (5).
For the indicators levelized total costs and domestic value added, the thresholds were lowered to 1% and 2% of the minimum value due to the lower uncertainty of the assessment methods. For acceptance, only results on hydrogen refuelling are available, and results on the ICE with gasoline are missing. It is worth mentioning that there are studies on the topic, e.g. Brunner et al. 61 However, their results are not comparable to those of Emmerich et al. 42 as every acceptance study is carried out in a specific context (spatial, technology and aim) using different methods. Thus, acceptance could not be included in this case study and no thresholds needed to be defined.
For the execution of PROMETHEE, the open access software Visual PROMETHEE was used.
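The preference functions and thresholds described above can be illustrated with a minimal Python sketch of a generic PROMETHEE II net-flow calculation. It is not the Visual PROMETHEE software used in the study, and the option names, scores and weights are hypothetical placeholders, not results of this paper. The sketch assumes strictly positive criterion values, so that thresholds defined as shares of the minimum value are meaningful.

```python
def linear_preference(d, q, p):
    """Linear preference function with indifference threshold q and preference threshold p."""
    if d <= q:
        return 0.0
    if d >= p:
        return 1.0
    return (d - q) / (p - q)


def usual_preference(d):
    """Usual preference function: any strictly positive difference counts fully."""
    return 1.0 if d > 0 else 0.0


def net_flows(scores, weights, minimise, thresholds):
    """Net outranking flows (PROMETHEE II).

    scores:     {option: {criterion: value}}
    weights:    {criterion: weight}, summing to 1
    minimise:   {criterion: True if lower values are better}
    thresholds: {criterion: (q_share, p_share)} as shares of the minimum value;
                criteria without an entry use the usual preference function
    """
    options = list(scores)
    n = len(options)
    flows = {a: 0.0 for a in options}
    for c, w in weights.items():
        if c in thresholds:
            lowest = min(scores[o][c] for o in options)
            q, p = thresholds[c][0] * lowest, thresholds[c][1] * lowest
            pref_fn = lambda d, q=q, p=p: linear_preference(d, q, p)
        else:
            pref_fn = usual_preference
        for a in options:
            for b in options:
                if a == b:
                    continue
                # d > 0 means option a performs better than option b on criterion c
                d = scores[b][c] - scores[a][c] if minimise[c] else scores[a][c] - scores[b][c]
                share = w * pref_fn(d) / (n - 1)
                flows[a] += share  # adds to the positive flow of a
                flows[b] -= share  # adds to the negative flow of b
    return flows


# Hypothetical data for illustration only (NOT values from the case study).
scores = {
    "option A": {"impact": 100.0, "cost": 0.50, "innovation": 1},
    "option B": {"impact": 104.0, "cost": 0.52, "innovation": 1},
    "option C": {"impact": 150.0, "cost": 0.60, "innovation": 0},
}
weights = {"impact": 0.4, "cost": 0.3, "innovation": 0.3}
minimise = {"impact": True, "cost": True, "innovation": False}
thresholds = {"impact": (0.05, 0.10), "cost": (0.01, 0.02)}  # shares of the minimum value

flows = net_flows(scores, weights, minimise, thresholds)
ranking = sorted(flows, key=flows.get, reverse=True)
```

In this toy data set, the 4% difference in "impact" between options A and B lies below the 5% indifference threshold and therefore does not create a preference, whereas their 4% cost difference exceeds the 2% preference threshold and counts fully.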

Results survey on weighting factors
First, the results of the survey on sustainability dimensions are presented, followed by the results of the surveys on environmental and social criteria.

Weighting of sustainability dimensions
In total, 60 answers were received for the survey on sustainability dimensions. The participants voted the environmental dimension of sustainability most important, with a resulting weighting factor $w_{\mathrm{Sus,env}}$ of 0.385 (Fig. 3). This is a significantly higher value compared to equal weighting, i.e. $w_{\mathrm{Sus},i} = 0.333$, and by far greater than that of the next dimension, social, with $w_{\mathrm{Sus,soc}} = 0.320$. Economy ranked lowest among the three sustainability dimensions with $w_{\mathrm{Sus,eco}} = 0.295$. The results of all three dimensions show a certain degree of variance over the 60 answers from the participants. With a standard deviation ($m_{\mathrm{Sus,env}}$) of 0.060, the weighting factor for the environmental dimension has the narrowest distribution. The opinions of the participants were more widely spread regarding the importance of the social dimension ($m_{\mathrm{Sus,soc}} = 0.069$), similar to the spread for the economic dimension with $m_{\mathrm{Sus,eco}} = 0.068$. As mentioned above, only one economic indicator is considered; thus, $w_{\mathrm{local,eco}} = 1$ and $w_{\mathrm{global,eco}} = w_{\mathrm{Sus,eco}} = 0.295$.

Environment
For the environmental criteria, 48 answers were received within the survey. The most important environmental criterion for the researchers is climate change ($w_{\mathrm{CC}} = 0.169$, with $\sum_{i} w_{i} = 1$). At the same time, however, it also has the largest interquartile range (0.050), meaning that the votes from the participants are widely spread, see Fig. 4. The participants consider photochemical ozone formation (POF) least important ($w_{\mathrm{POF}} = 0.099$), closely followed by acidification ($w_{\mathrm{Acid}} = 0.102$). To derive the local environmental indicator weights ($w_{\mathrm{local,env},j}$) from the results for the environmental criteria, eqn (4) was applied. All results are listed in Table 2.
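The two-step derivation of local indicator weights (eqn (4)) and their conversion to global weights (eqn (3)) can be sketched in a few lines of Python. The criterion names, the clustering and the criterion weights below are hypothetical placeholders chosen for readability, not the survey results; only the dimension weight of 0.385 is taken from the text.

```python
def local_indicator_weights(criterion_weights, indicator_to_criterion):
    """Eqn (4)-style normalisation: each indicator inherits the weight of its
    criterion, then all inherited weights are divided by their sum."""
    inherited = {
        indicator: criterion_weights[criterion]
        for indicator, criterion in indicator_to_criterion.items()
    }
    total = sum(inherited.values())
    return {indicator: w / total for indicator, w in inherited.items()}


def global_weights(local_weights, dimension_weight):
    """Eqn (3)-style product of local weight and dimension weight."""
    return {name: w * dimension_weight for name, w in local_weights.items()}


# Hypothetical criterion weights (NOT the survey results); they sum to 1.
criterion_weights = {"climate": 0.5, "toxicity": 0.3, "acidification": 0.2}
# Hypothetical clustering: two indicators share the criterion "toxicity".
indicator_to_criterion = {
    "climate change": "climate",
    "toxicity, cancer": "toxicity",
    "toxicity, non-cancer": "toxicity",
    "acidification": "acidification",
}

local = local_indicator_weights(criterion_weights, indicator_to_criterion)
# 0.385 is the weight of the environmental dimension reported in the text.
glob = global_weights(local, dimension_weight=0.385)
```

Because every indicator inherits the full weight of its criterion before normalisation, a criterion that contains several indicators ends up with a larger combined share than its surveyed weight; in this toy example, "toxicity" receives 0.6/1.3, i.e. about 0.46 instead of 0.3.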

Social
The survey regarding the social criteria was answered by 55 participants. They consider the acceptance of emerging technologies by the general public the most important social criterion. Domestic value added and innovation potential received rather similar weighting factors, which are slightly lower than that for acceptance, see Fig. 5. For the subsequent assessment of the case study on hydrogen mobility, the criterion acceptance was not considered due to the lack of data for the fossil reference. Thus, the number of criteria is reduced from three to two for the calculation of the local and global weights. For this calculation, the same method is applied as for the environmental criteria in section Implementation, see eqn (3) and (4). The resulting weighting factors for the social indicators are listed in Table 3.

Case study: hydrogen mobility
The weighting factors derived from the participatory approach are tested for the first time on a case study on hydrogen mobility. After describing how hydrogen mobility is modelled, the detailed indicator results are presented. Finally, the MCDA is performed with the derived weighting factors to guide the decision process regarding the different hydrogen and fossil mobility options.

Modelling of hydrogen mobility
In this case study, a passenger FCEV was analysed in detail with different hydrogen supply chains and compared to a conventional ICE vehicle fuelled with gasoline. The boundary conditions were set for Germany for the year 2050. Further background information, e.g. electricity mix and costs, is taken from the Helmholtz-Alliance ENERGY-TRANS. 62 The two vehicle types have the same power rating of 100 kW. While the ICE vehicle was based on a VW Golf, 63 the FCEV was based on a Toyota Mirai, 64 downscaled to 100 kW. More detailed information regarding the modelling of the vehicles can be found in Haase et al. 24 The hydrogen was produced in an alkaline water electrolyser 65 using wind power as an energy source. The LCI modelling of the wind power was based on Schreiber et al. 66 For transport and distribution of hydrogen, different technologies are available. Currently, the most common transport methods are gaseous hydrogen in high-pressure tanks (compressed gaseous hydrogen, CGH₂) and liquid hydrogen (LH₂) in cryogenic tanks by truck. Alternatively, hydrogen storage and transport in liquid organic hydrogen carriers (LOHCs) by truck was considered. The fourth alternative analysed is the construction of a new hydrogen pipeline network in Germany.
To avoid susceptibility to wind power fluctuations, the hydrogen needs to be stored, if necessary, for months. Therefore, seasonal storage in salt caverns was considered for gaseous hydrogen. Liquid hydrogen, as well as hydrogen in LOHCs, can be stored in appropriate tanks. The most important technical parameters are summarized in Wulf et al. 67 An overview of the system boundaries and the different process steps of the four FCEV options (CGH₂, pipeline, LH₂, LOHC) and the ICE vehicle option (gasoline) is depicted in Fig. 6.

Indicator results
The sustainability assessment of hydrogen mobility was based on earlier publications on LCA, LCC and selected social indicators. 43,48,67–69
Life Cycle Assessment of hydrogen mobility. The functional unit of the LCA was one driven vehicle kilometre. The results are depicted in Fig. 7 relative to the conventional ICE vehicle fuelled by gasoline. In four impact categories, hydrogen mobility has clear advantages compared to the fossil ICE vehicle. Besides the categories climate change and resources, which, of course, benefit from the switch from mineral oil to wind power as primary energy source, ionizing radiation and ozone depletion show much lower results for the hydrogen-based options. For ionizing radiation, the much larger impacts for the ICE vehicle are related to gasoline production (over 60%), where low-level radioactive waste is accumulated. Regarding ozone depletion, over 90% of this impact is caused by direct bromotrifluoromethane emissions during petroleum production. Bromotrifluoromethane, also known as Halon 1301, is a fire-suppressing agent. However, the hydrogen-based options show much higher impacts than the gasoline ICE vehicle in three categories. Construction of the FCEV demands more mineral resources than that of conventional ICE vehicles, distributed over various components, leading to more waste flows from mining processes. In particular, phosphate emissions to the ground occur during the treatment of tailings, leading to high impacts for eutrophication, freshwater. These emissions are, for example, responsible for 55% of this impact category for the pipeline option and are 90% higher than the emissions for the ICE vehicle. Regarding human toxicity, cancer, the higher steel demand in the FCEV leads to chromium VI emissions to the ground during slag treatment from steel production.
For human toxicity, non-cancer, as for eutrophication, freshwater, a high impact comes from the treatment of tailings, which accounts for 54% of the overall impact in this category for the pipeline option. The high impact in this category is mainly caused by arsenic ion and zinc ion emissions to water. Compared to the gasoline vehicle, the four hydrogen-based options have rather similar results, with a small preference for the hydrogen supply chain using pipelines.
Life Cycle Costing of hydrogen mobility. Analogous to the LCA, the functional unit of the LCC was one driven kilometre. In this study, the costs considered comprise capital costs (levelized costs of car acquisition without VAT), consumables (fuel supply costs) and other operating costs (maintenance and repairs, insurance). Fuel supply costs comprise production costs of fuel as well as costs for transport, storage, and service stations. The results for the LCC of hydrogen mobility are depicted as absolute numbers in Fig. 8. Under the assumption that FCEVs will achieve the same purchase cost level as a similar ICE vehicle in 2050, 19,70 the gasoline ICE vehicle will have slightly higher costs than the hydrogen-based vehicles. Even though gasoline is less expensive than hydrogen supply, the higher costs for maintenance of an ICE 19 offset this advantage. Taking a closer look at the different options for hydrogen supply, the results for CGH₂ and pipeline supply show equally low costs. Even though a refuelling station for liquid hydrogen and the transport of liquid hydrogen cost less than for gaseous hydrogen, the costs for liquefaction nullify these cost advantages. The LOHC transport technology is the least advanced hydrogen transport option. From today's view, even with the considered technological and economic improvements, the LOHC option will be more expensive than the other hydrogen transport options for this case study. However, it can keep up with the gasoline ICE vehicle. A more detailed discussion of the results for hydrogen mobility, e.g. hydrogen supply cost, can be found in Haase et al. 11
Social assessment of hydrogen mobility. Due to the lack of comparable quantitative results for the indicator acceptance, only the results for the indicator domestic value added and the criterion innovation potential with its indicators are presented here. For more results on and the methodology of the criterion acceptance, the articles by Emmerich et al. 18 and Baur et al. 71 are suggested.
Domestic value added. It is assumed that 64% of vehicles are produced domestically and that the value added of costs for labour, car maintenance and repairs as well as car insurance is fully domestic. Relative results using the gasoline ICE vehicle as a reference are presented in Fig. 9. As this analysis is based on the LCC, it is not surprising that the results for this indicator also lie very close together. It is safe to say that mobility with the gasoline ICE vehicle is less likely to produce domestic value added than the FCEV options due to the higher domestic value added of hydrogen production compared to gasoline production. The results of the hydrogen-based options are very close together, and with the underlying assumptions 48 it is not possible to make a distinction between the analysed options. Due to the hydrogen production in Germany, i.e. no import, and only national transport and distribution, all four options have a high share of domestic value added (>75%).
Innovation potential. Assessing the innovation potential for all technologies related to the hydrogen mobility options is very time-consuming. Thus, only a spotlight is put on fuel production here. For the FCEV options, this means that patents for alkaline water electrolysis are analysed, whereas for the gasoline ICEV, patents regarding gasoline refineries are taken into consideration.
The results for the patent-based indicators are displayed in Fig. 10, via a portfolio analysis, which combines the three selected patent indicators. The patent intensity is displayed on the x-axis, while growth rates are shown on the y-axis. The National technology share of a country is indicated through the size of the bubbles.
For alkaline water electrolysis, Japan has the highest patent intensity, combined with a comparably high patent growth rate and a strong emphasis on alkaline water electrolysis relative to all patents in Japan (high National technology share) (Fig. 10, Table 2 in the ESI †). This reflects the important position of hydrogen in Japan's path to decarbonization. In contrast, regarding gasoline refineries, Japan is not one of the five most important countries. Only China and the US show patent activities for both technologies. China has, for both technologies, a higher patent intensity than the US and a higher National technology share. This is in line with the findings for other technologies that, in general, a lot of patents are filed in China. 43 Germany also shows patent activities for both technologies within the top five countries, but not on a comparable level with the leading countries for each technology. For more details on the patent analysis of alkaline water electrolysis, refer to Baumann et al. 43 As can be seen, Germany is the leading country neither for alkaline water electrolysis nor for gasoline refineries. However, for alkaline water electrolysis Germany reaches a mean rank of 3.3, while for gasoline refineries only a mean rank of 4.7 is achieved (Table 3 in the ESI †). Thus, the innovation potential for alkaline water electrolysis can be considered higher than for gasoline refineries.

MCDA results
Aer discussing results from the individual indicators, in a next step, these results are the basis for the MCDA. They are combined with the preferences of technology developers from academia, see section Results survey on weighting factors, to provide clear guidance regarding the sustainability of gasoline and hydrogen mobility.  According to the PROMETHEE II approach described above, the outranking of the different mobility options is displayed in Fig. 11. The gure shows the results comparing the weighting factors derived from the technology developers with weighting factors based on hierarchical equal weighting (ESI Table 4 †). The higher the outranking ow F for a technology option is, the more sustainable this option is. Detailed results for PROM-ETHEE I and II are listed in Table 5 in the ESI. † Both weighting factor sets support the same conclusion: FCEVs supplied with hydrogen from a pipeline system is more sustainable than the other hydrogen options and the gasoline ICEV is less sustainable than an FCEV.
The LOHC supply chain of hydrogen results in the highest costs and, for several environmental indicators, in the worst results. Thus, in the overall sustainability ranking, this option comes last among the FCEV options. The pipeline supply chain of hydrogen stands out mainly through good environmental results. The gasoline vehicle shows favourable results in some environmental indicators, i.e. ecotoxicity, human toxicity (cancer and non-cancer) and eutrophication, freshwater. However, this cannot offset the higher costs.
As can already be deduced from the PROMETHEE results for the two weighting factor sets, the sustainability ranking of the different technology options is rather robust. When taking the technology developer weighting factor set and varying only one weighting factor of the environmental indicators at a time, the weighting factor for eutrophication, freshwater is the first to change the ranking. However, its global weight must change from 0.03 to 0.15 to induce a change of the sustainability ranking, which is a rather large change. At that point, the LH₂ FCEV would become more sustainable than the CGH₂ FCEV and step up to the second most sustainable option after the pipeline option. Regarding the weighting factor for the levelized costs, a change in the sustainability ranking would not occur if the weighting factor increased. In contrast, if the weighting factor of the costs decreases to 0.16, the LH₂ FCEV becomes more sustainable than the CGH₂ FCEV.
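A one-at-a-time weight variation of this kind can be sketched as follows. The text does not state how the remaining weights were adjusted when one weight was varied; the sketch assumes that they are rescaled proportionally so that all weights still sum to one. The ranking function is a simple weighted sum used as a stand-in for PROMETHEE II, and all names and numbers are hypothetical.

```python
def rescale_weights(weights, key, new_value):
    """Set one weight to new_value and scale the others proportionally
    so that the sum stays 1 (assumed adjustment rule)."""
    rest = sum(w for k, w in weights.items() if k != key)
    factor = (1.0 - new_value) / rest
    return {k: new_value if k == key else w * factor for k, w in weights.items()}


def first_rank_change(weights, key, trial_values, rank_fn):
    """Return the first trial value at which the ranking differs from the baseline, or None."""
    baseline = rank_fn(weights)
    for value in trial_values:
        if rank_fn(rescale_weights(weights, key, value)) != baseline:
            return value
    return None


# Hypothetical, already normalised scores (higher is better), for illustration only.
scores = {
    "option A": {"env": 0.9, "cost": 0.5},
    "option B": {"env": 0.4, "cost": 0.8},
}


def rank_by_weighted_sum(weights):
    """Stand-in ranking; in the study the ranking comes from PROMETHEE II instead."""
    totals = {o: sum(weights[c] * s[c] for c in weights) for o, s in scores.items()}
    return sorted(totals, key=totals.get, reverse=True)


weights = {"env": 0.6, "cost": 0.4}
trial_values = [round(0.4 + 0.05 * k, 2) for k in range(13)]  # 0.40, 0.45, ..., 1.00
tipping_point = first_rank_change(weights, "cost", trial_values, rank_by_weighted_sum)
```

The distance between the baseline weight and the tipping point indicates how robust the ranking is with respect to that criterion; any ranking function that maps a weight set to an ordered list of options can be passed in place of the stand-in.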
As described in section Results survey on weighting factors, the results regarding the weighting factors for the three sustainability dimensions showed some variation across the 60 participants. To show the impact of the deviation between the participants, three weighting factor sets with the maximum values for the social, environmental and economic dimension, respectively, are discussed here (Table 4). As no connection between the answers of the survey regarding sustainability dimensions and the surveys regarding single criteria can be made, the results from the other surveys are kept constant. The resulting weighting factors for the indicators are listed in Table 6 in the ESI. † The sustainability rankings of these new weighting factor sets are displayed in Table 5. The weighting factor set with the highest weighting factor for the social dimension (max social) shows no difference in the sustainability ranking, even though the environmental dimension is weighted much less than in the average weighting factor set. For the extreme weighting factor set preferring the environment (max environment), a small difference can be detected in the rankings: the CGH₂ option now ranks worse than the LH₂ option due to its more severe environmental impacts. When putting the economic dimension of sustainability first (max economy), the ranking again does not change. However, only 10% of the participants consider the economy the most important sustainability dimension, whereas 62% of the participants put the environment first and 28% the social dimension. The PROMETHEE outranking flows for these extreme weighting factor sets are listed in Table 7 in the ESI. †

Discussion and conclusions
It was the goal of this paper to extend an existing approach for sustainability assessment 11 of energy technologies with MCDA to arrive at an integrated sustainability assessment. To achieve this goal, the MCDA method PROMETHEE was chosen for the aggregation of indicator results due to its relatively easy application for an outranking method. Furthermore, weighting factors for sustainability dimensions and criteria/indicators were elicited from energy technology developers and energy system modellers from academia for the assessment of innovative energy technologies in the context of Germany. This extended approach was afterwards tested on a case study assessing hydrogen mobility in comparison to gasoline mobility for individual transport. Furthermore, the elicited weighting factors were compared with an equal weighting of sustainability dimensions and criteria/indicators. Additionally, the influence of using extreme weighting factors was discussed with the help of the case study results.
During this work, several issues were identified that need to be addressed when using MCDA for sustainability assessment. These include the procedure for weighting factor determination, the necessity to expand the indicator set, the integration of other stakeholders and specifics regarding the case study on hydrogen mobility, which will be discussed in the following.

Weighting factor determination
As mentioned above, dening weighting factors is one of the most important steps in MCDA, but also a challenging procedure that has to be carried out carefully and in a transparent way. The method SMART was used for the elicitation of weighting factors because of its relatively easy procedure. Independent of the chosen method, one of the challenges are limitations of the cognitive spans (memory span, perceptual span etc.) of the participants. This is elementary as only a limited number of distinctions can be grasped by participants at once as a base of making judgements. 51,59 Even though this was taken under consideration for the design of this study with the stakeholders in the workshop, a decrease in participants from the rst survey (n = 60) to the last survey (n = 48) was evident. The last survey was about the environmental criteria and was the most complex one with eight different environmental criteria. The highest effort was put into this last survey to provide adequate background information about the environmental criteria and the used scale before eliciting corresponding weighting factors. Therefore, the risk that stakeholders might have not properly understood the criteria or interpreted them differently should be minimized. 72 Informing the participants about the different criteria so that they can make an informed decision was the most time consuming part of the stakeholder workshop. At the same time, this part of the procedure is most prone to bias due to the way the information is presentedeven if it is done subliminally. In contrast, the opinions of the participants might be preconceived by discussions in the media. 73 Furthermore, as the creators of the survey have a strong background in LCA there might be a bias towards the presentation of environmental indicators and criteria. 
To minimize the risk of inuencing the participants in their opinion towards the environmental criteria the survey about the sustainability dimensions was held before each environmental criterion was presented. For future surveys it might be helpful to either present the different criteria by different people, each of them experts in the presented criteria or by one person, who is objective towards all the presented criteria. Despite these challenges we were able to generate a meaningful weighting factor set for sustainability assessment of energy technologies.

Indicator expansion
Even though dening and implementing social/socio-economic indicators is challenging, it is of utmost importance to extend these. In particular, the criterion acceptance, which is oen discussed in the media, needs to be developed into a functioning indicator that can be used for comparing different technologies. This is also stressed by the fact that acceptance was regarded as most important by the stakeholders for the implementation of innovative energy technologies. In line with acceptance, other criteria and indicators should be developed, e.g. security of energy supply, to make this sustainability assessment approach more holistic. Introducing such new criteria would imply a new stakeholder survey though. Also, we recommend a monitoring of the weighting factors every two to ve years due to the temporarily limited validity of the same. This limitation stems from a changing world with hardly predictable conicts and disasters, making it necessary to include other/more criteria. In line with this, also the perception of the already existing criteria might change.

Stakeholders integrated
An extension of the considered stakeholders beyond academia should be considered in future work to reflect the heterogeneity of societal perspectives on the relevance of different aspects. Such an extension could also lead to a shift in preferences, e.g. towards social or economic aspects. In particular, stakeholders from industry must be integrated when technologies on the verge of commercialization are to be assessed. In addition, the involvement of multiple interest groups reduces the danger of biased weights that might favour specific technologies. In any case, the selection and involvement of participants should be transparent and balanced. 25

Case study specifics
The sustainability assessment of hydrogen mobility showed that FCEVs are best supplied with hydrogen by pipelines. However, it was also shown that applying the derived weighting factor sets does not have an impact on the overall results of this case study. For hydrogen mobility, this means robust results for choosing the most sustainable option under the given weight elicitation conditions. For the method for sustainability assessment, this means that more case studies should be performed to test the MCDA approach, e.g. for batteries or second-generation biofuels.
Aer all, the question arises, if the time-consuming process of stakeholder integration for weighting factor elicitation is necessary. Even though the chosen SMART method belongs to the less time-consuming ones. In our opinion, this is necessary because on the one hand results might be more sensitive to different weighting factors for other case studies and on the other hand the awareness of sustainability indicators within energy technology developers was risen. An MCDA with equal weighting can merely give a rough estimation for the comparison of energy technologies. Only the integration of stakeholders can give legitimation for decision-making processes.

Conflicts of interest
There are no conicts to declare.