> When the general is weak and without authority; when his orders are not clear and distinct; when there are no fixed duties assigned to officers and men, and the ranks are formed in a slovenly haphazard manner, the result is utter disorganization.
>
> Sun Tzu

One of the easiest and most common ways for companies to fail in their scalability-related endeavors is to lack clarity about who is responsible for what. Clearly defining high-level goals and objectives is a leadership responsibility, and defining the roles and responsibilities of executives, organizations, and individual contributors is a management responsibility. A lack of clarity can be disastrous for the company and organization in a number of ways. In this chapter, we start by looking at two very real examples of what can happen without clear roles and responsibilities. We then discuss executive roles, organizational responsibilities, and individual contributors’ roles, and conclude by introducing a tool that is extremely useful for ensuring that initiatives have all the proper roles filled.

This chapter is meant for companies of all sizes. For large companies, it can serve as a checklist to ensure that you have covered all of the technology and executive roles and responsibilities as they relate to scale. For small companies, it can help jump-start the process of ensuring that you have your scalability-related roles properly defined. For the technology neophyte, it is a primer on how technology organizations should work, and for seasoned technology professionals, it is a reminder to review organizational structure to validate that you have your scalability-related needs covered.
For all companies, it clearly defines the need for everyone, from individual contributors through the chief executive, to be involved with the scalability of the systems, organizations, and platforms that run their company.

#### The Effects of Failure

On one end of the spectrum, a lack of clarity around roles and responsibilities may result in individuals or groups not performing a necessary task, which may in turn result in one or more failures of your product, organization, or processes. Take, for instance, the case where no team or individual is assigned the responsibility of capacity planning. In this context, capacity planning is the comparison of expected demand to system capacity (supply, or maximum capacity by type of request), resulting in a set of proposed actions to ensure that capacity matches demand. Expected demand is defined by the forecasted number of requests, by function, placed on the system in question. The proposed set of actions may include requesting the purchase of additional servers, requesting an architectural evaluation of system components to allow systems to scale to meet demand, or requesting that systems be modified to scale more cost effectively.

The flow for this example may start with a business unit creating a demand forecast and handing it off to the person responsible for capacity analysis and planning. The capacity planner looks at the types of demand forecasted by the business unit and translates those into the resulting product/system/platform requests.
He or she then also factors in the expected results of product/system changes that create new functionality and determines where the system will need modifications in order to meet new demand, new functionality, or functionality modifications. The resulting deficiencies are then passed on to someone responsible for determining what actions should be taken to correct them. Those actions, as previously identified, may be the purchase of new systems, a change in the architecture of certain components of the platform (such as the data model), or the movement of demand from one group of services to another.

In this case, the absence of a team or person responsible for matching expected demand to existing capacity and determining appropriate actions would be disastrous in an environment where demand is growing rapidly. Nevertheless, this failure happens all the time, especially in young companies. Even companies that have a person or organization responsible for capacity planning often fail to plan for their newest system additions.

On the other end of the spectrum is a case where organizations are given similar responsibilities but are not required or given incentives to work together to successfully complete their objectives. If you are in a smaller company where everyone knows what everyone else is doing, this may seem a bit ridiculous to you. Unfortunately, this problem exists in many of our larger client companies, and when it happens it not only wastes money and destroys shareholder value, it can also create long-term resentment between organizations and destroy employee morale.

In this case, let’s assume that an organization is split between an engineering organization responsible primarily for developing software and an operations organization responsible primarily for building and deploying systems, creating and managing databases, deploying networks, and so on. Let’s further assume that we have a relatively inexperienced CTO who has recently read a book on the value of shared goals and objectives and has decided to give both teams the responsibility of scaling the platform to meet expected customer demand. The company has a capacity planner who determines that, to meet next year’s demand, the teams must scale the subsystem responsible for customer contact management to handle at least twice the number of transactions it is capable of handling today.

Both the engineering and operations teams have architects who have read the technology section of our book, and both decide to split the database supporting the customer contact management system. Both architects believe they are empowered to make the appropriate decisions without the help of the other, as they are unaware that multiple people have been assigned the same responsibility and were not informed that they should work together. The engineering architect decides that a split along transaction boundaries (or functions of a Web site, such as buying an item and viewing an item on an ecommerce site) will work best, and the operations architect decides that a split along customer boundaries, where groups of customers reside in separate databases, makes the most sense.
Both go about making initial plans for the split, setting their teams to work and then making requests of the other team to perform some work.

This example may sound a bit ridiculous to you, but it happens all the time. At best, the two teams stop there and resolve the issue, having “only” wasted the valuable time of two architects. Unfortunately, what usually happens is that the teams polarize and waste even more time in political infighting, and the result after all that wasted time isn’t materially better than if a single person or team had been given the responsibility to craft the best solution with the input of other teams.

#### Defining Roles

This section gives an example of how you might define roles to help resolve issues such as those identified in the preceding section. We give examples of how roles might be defined at the top leadership level of the company (the executive team), within classic technology organizational structures, and at an individual contributor level.

Our examples of executive, organizational, and individual contributor responsibilities are not meant to restrict you to specific job titles or organizational structures. Rather, they are meant to help outline the necessary roles within a company. We’ve chosen to define these roles by the organizations in which they have traditionally existed to make them easier to understand for a majority of our audience.
For instance, you may decide that you want operations, infrastructure, engineering, and QA to exist within the same teams, with each team dedicated to a specific product line. You may recall from our introductory discussion on organizational structure that there is no “right or wrong” answer on the topic, simply benefits and drawbacks of any decision. The important point is to remember to include all of the appropriate responsibilities in your organizational design and to clearly define not only who is the responsible decision maker but also who is responsible for providing input to any decision, who should be informed of the decision and actions, and who is responsible for making the decision happen. We’ll discuss this last point in a brief section on a valuable tool later in this chapter.

#### A Brief Note on Delegation

Before launching into the proposed division of responsibilities within an organization, we thought it important to include a brief note on delegation. In defining roles and responsibilities within organizations, you are creating a blueprint of delegation. Delegation, broadly speaking, is the act of empowering someone else to act on your behalf. For instance, by giving an architect or architecture team the responsibility to design a system, you are delegating the work of creating that architecture to that team. You may also decide to delegate the authority to make decisions to that team, depending upon its capabilities, the size of your company, and so on.

Here’s a very important point.
You can delegate anything you would like, but you can never delegate the accountability for results. At best, the team or individual to whom you delegate can inherit that responsibility, and you can ultimately fire, promote, or otherwise reward or punish the team for its results, but you should always consider yourself responsible for the end result. Great leaders get this intuitively: they put great teams ahead of themselves in success and take public accountability for failures. Poor leaders assume that they can “pass the buck” for failures and take credit for successes.

To codify this point in your mind, let’s apply “the shareholder test.” Assume that you are the CEO of a company and you have decided to delegate the responsibility for one of your business units to a general manager. Can you imagine telling your board of directors or your shareholders (whom the board represents) that you will not be held accountable for the results of that business? One step removed, do you think the board will not hold you at least partially responsible if the business begins to underperform relative to expectations?

Again, this does not mean that you should make all the decisions yourself. As your company and team grow and scale, you simply won’t be able to do that, and in many cases you might not be qualified to make the decisions. For instance, a nontechnical CEO should probably not be making architecture decisions, and a CTO of a 200-person engineering organization should not be writing the most important code, as he or she is needed for other executive tasks.
It simply makes the point that you absolutely must have the best people possible to whom you can delegate and that you must hold those people to the highest possible standards. It also means that you should be asking the best questions possible about how someone came to his or her decisions on the most critical projects and systems.

#### Executive Responsibilities

The executives of a company, as a team, are responsible more than anyone else for imprinting the company with “scale DNA” and creating a culture of scalability as defined in our introductory chapter. Getting high-level executive responsibilities right is the easiest thing to do and also the most overlooked aspect of ensuring that organizations can scale to the needs of the company and, further, that organizations support the need to scale the technology that makes the company money.

#### CEO

The CEO is the chief scalability officer of the company. As with all other matters within the company, he or she is the final decision maker and arbiter of all things related to scale. A good technology company CEO needs to be appropriately technically proficient, but that does not mean that he or she needs to be the technology expert or the primary technology decision maker.

It is hard to imagine that someone would rise to the position of CEO and not understand how to read a balance sheet, income statement, or statement of cash flow. That same person, unless she has an accounting background or is a former CFO, is not likely to understand the intricacies of each accounting policy, nor should she need to. The CEO’s job is to ask the right questions, get the right people involved, and get the right outside help or advice to arrive at the right answer.

The same holds true in the technical world: the CEO’s job is to understand some of the basics (the equivalent of the financial statements mentioned above), to know which questions to ask, and to know where to get help. Here is some advice for CEOs and other managers responsible for technical organizations who have not been the chief technology officer or chief information officer of a company, do not have technical degrees, or have never been engineers:

**Ask Questions and Look for Consistency in Explanations** Part of your job is to be a truth seeker, because only with the truth can you make sound and timely decisions. Although we do not think it is commonplace for teams to lie to you, it is very common for teams to have different pieces and perceptions of the truth, especially when it comes to issues of scale. When you do not understand something, or something does not seem right, ask questions. When you are unable to discern fact from perception, look for consistency in answers.
If you can get over any potential ego or pride issues with asking what might seem to be ignorant questions, you will find that you not only quickly educate yourself but also create and hone a very important skill in finding truth.

This executive interrogation is a key ability shared by many successful leaders. Knowing when and where to probe, and probing until you are satisfied with the answers, need not be limited to the CEO. In fact, managers and individual contributors should all hone this skill, starting early in their careers.

**Seek Outside Help** Seek help from friends or professionals who are proficient and knowledgeable in the area of scalability. Don’t bring them in and attempt to have them sort things out for you; that can be very damaging. Rather, we suggest creating a professional or personal relationship with a technically literate firm or peer. Leverage that relationship to help you ask the right questions and evaluate the answers when you need to dive deeply.

**Improve Your Scalability Proficiency** Create a list of your weaknesses in technology (things about which you have questions) and go get help to become smarter. You can ask questions of your team and outside professionals. Read blogs on scale-related issues relevant to your company or product and attend workshops on technology for people without technical backgrounds. You probably already do this through professional reading lists on other executive topics; add technology scalability to the list.
You do not need to learn a programming language, understand how an operating system or database works, or understand how Carrier Sense Multiple Access with Collision Detection (CSMA/CD) is implemented. You just need to get better at asking questions and evaluating the issues your teams bring to you. Scalability is a business issue, but to solve it, you need to be at least somewhat conversant in the technology portion of the equation.

More than likely, the CEO will decide to delegate authority to several members of his or her team, including the chief financial officer (CFO), individual business unit owners (a.k.a. general managers), and the head engineering and technology executive (referred to as either the CTO or CIO in our book).

#### CFO

Most likely, the CEO has delegated the responsibility for budgeting to the CFO, although this may not always be the case. Budgeting is informed by capacity planning and, as we’ve seen in our previous example of how things can go wrong, capacity planning is a very large part of successfully scaling a system. Ensuring that the team and company have sufficient budget to scale the platform/product/system is a key portion of the budgeting officer’s responsibility. The budget needs to be sufficiently large to allow the company to scale to the expected demand by purchasing or leasing servers and hiring the appropriate engineers and operations staff. That said, the budget should not be so large that the company spends money on scale long before it truly needs it, because such spending dilutes near-term net income for very little benefit.
Purchasing and implementing “just in time” systems and solutions optimizes the company’s net income and cash flow.

The CFO is also not likely to be very technical, but can benefit from asking the right questions and creating an appropriate network, just as we described for the CEO. Questions that the CFO might ask regarding scalability include what other scale alternatives were considered in developing the proposed budget for scale and what tradeoffs were made in deciding upon the existing approach. The intent here is to ensure that the team considered more than one option. A bad answer would be “This is the only way possible,” as that is rarely the case. (We want to say it is never the case, but it is possible to have a case where only one route is possible.) A good answer might be “Of the options we evaluated, this one allows us to scale horizontally at comparatively low cost while setting us up to scale even more cost effectively in the future by laying a framework whereby we can continue our horizontal scale.”

#### Business Unit Owners, General Managers, and P&L Owners

More than any other position, the business unit general manager or owner of the company or division’s profit and loss statement (also called the income statement or P&L) is responsible for forecasting the platform/product/system-dependent business growth.
In small- to medium-sized companies, it is very likely that the business unit owner is the CEO and that he or she might delegate this responsibility to some member of his or her staff. Nevertheless, demand projections are critical to the activity of determining what needs to be scaled so that the budget for scale doesn’t become too large ahead of the corporate need.

Very often, we run into situations in which the business unit owner claims that demand simply can’t be forecasted. Here, demand means the number of requests that are placed against a system or product. This is a punting of responsibility that simply should not be tolerated within any company. In essence, the ownership of demand forecasting that the business declines gets moved to the technology organization, which in turn is very likely less capable of forecasting demand than the business unit owner. Yes, it is very likely that your forecasts will be wrong, especially in their infancy, but it is absolutely critical that you start the process early in the life cycle of the company and mature it over time.

Finally, as with other senior executive staff of the company, the business unit owner is responsible for helping to create a culture of scalability. Asking the right questions of his or her peer (or subordinate) in the technology organization and trying to ensure that the technology partner receives the funding and support to properly support the business unit in question are both essential to the success of scalability within the company.
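
To make the hand-off from demand forecast to proposed action more concrete, the capacity-planning comparison described earlier in this chapter can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the request types, the per-server capacity figures, the `headroom` safety margin, and the names `RequestType` and `plan_capacity` are hypothetical and do not come from the book. A real capacity plan would also weigh architectural changes, not just server counts.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class RequestType:
    """One function of the system, as forecast by the business unit (hypothetical model)."""
    name: str
    forecast_peak_rps: float        # forecast demand, in requests per second
    capacity_per_server_rps: float  # assumed measured maximum per server
    servers: int                    # servers currently handling this request type


def plan_capacity(request_types, headroom=0.25):
    """Compare forecast demand to current capacity, by request type.

    Returns a list of (name, extra_servers) for every request type whose
    capacity falls short of forecast demand plus the headroom margin.
    `headroom` is an assumed safety margin, not a figure from the text.
    """
    shortfalls = []
    for rt in request_types:
        required = rt.forecast_peak_rps * (1 + headroom)
        capacity = rt.capacity_per_server_rps * rt.servers
        if capacity < required:
            needed = ceil(required / rt.capacity_per_server_rps)
            shortfalls.append((rt.name, needed - rt.servers))
    return shortfalls


# Illustrative numbers only.
forecast = [
    RequestType("view_item", forecast_peak_rps=900, capacity_per_server_rps=100, servers=10),
    RequestType("buy_item", forecast_peak_rps=100, capacity_per_server_rps=50, servers=4),
]

shortfalls = plan_capacity(forecast)
for name, extra in shortfalls:
    print(f"{name}: add {extra} server(s) or request an architectural evaluation")
```

The sketch keeps the two inputs separate on purpose: the forecast figures are owned by the business unit, while the capacity figures are owned by whoever is assigned capacity planning. If either owner is missing, the comparison cannot be made at all.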

#### CTO/CIO

Although the CEO is the chief scalability officer of the company, the chief technology executive is the chief technology, technical process, and technology organization scalability officer. In some companies, particularly Internet companies, the chief technology executive is often titled the CTO or chief technology officer. In these companies, the CTO might oversee another technology executive who is responsible for corporate technology, that is, the technology that runs the back-office systems of the company; this person is often titled the CIO or chief information officer. In older companies, the chief technology executive is often titled the CIO, whereas the CTO is very often the head engineer or head architect. We will use CTO and CIO interchangeably throughout this book to mean the chief technology executive of the company. He or she most likely has the best background and capabilities to ensure that the company scales cost effectively ahead of the product/system or platform needs.

In essence, “the buck stops here.” Although it is true that the CEO can’t truly “delegate” responsibility for the success of the platform scalability initiatives, it is also true that the chief technology executive inherits that responsibility and shares it with the CEO. A failure to properly scale will likely result in at least the termination of the chief technology executive, portions of his or her organization, and potentially even the CEO.

The CTO/CIO must create the technical vision of the company overall and, for the purposes of our discussion, within a growth company that vision must include elements of scale. The chief technology executive is further responsible for setting the aggressive, measurable, and achievable goals that nest to that vision and for ensuring that his or her team is appropriately staffed to accomplish the associated scalability mission of the organization. The CTO/CIO is responsible for the development of the culture and processes surrounding scalability that will help ensure that the company is always ahead of end-user demand.

The CTO/CIO will absolutely need to delegate responsibilities for certain aspects of decision making around scalability as the company grows, but as we pointed out previously, this never eliminates his or her responsibility to ensure that it is done correctly, on time, and on budget. Additionally, in hyper-growth environments where scale is critical to company survival, the CTO should never delegate the development of the vision for scale. The term “lead from the front” is never more important than here, and the vision does not need to be deeply technical.

Although the best CTOs we have seen have had technology backgrounds varying from once having been an individual contributor to having been a systems analyst or technical project manager, we have seen examples of successful CTOs without such backgrounds.
When you have a nontechnical CTO/CIO, it is absolutely critical that he or she has some technical acumen and is capable of speaking the language and understanding the critical tradeoffs within technology, such as the relationship of time, cost, and quality. Inserting a technology neophyte to lead a technical organization is akin to throwing a nonswimmer overboard into a lake; you may be pleased with your results assuming the person can swim, but more often than not you’re going to need to find yourself a new partner in your boat.

Equally important is that the CTO have some business acumen. Unfortunately, this is as difficult to achieve as finding a chief marketing officer with a Ph.D. in electrical engineering (not that you’d necessarily want one); they exist, but they are difficult to find. Most technologists do not learn about business, finance, or marketing within their undergraduate or graduate courses. Although the CTO does not need to be an expert on capital markets (that’s likely the job of the CFO), he or she should understand the fundamentals of the business in which the company operates. For example, the CTO should be able to read and understand the relationships between the income statement, balance sheet, and statement of cash flow. He or she should also understand marketing basics to the level of at least a community college or company-sponsored course on the subject. This is not to say that the CTO needs to be an expert in any of these areas; rather, a basic understanding of these topics is critical to making the business case for scalability and to being able to communicate effectively in the business world. We’ll discuss these areas in later chapters.

#### Organizational Responsibilities

We are going to describe roles in terms of organizational responsibilities within a traditionally constructed technology team. Such a team usually consists of teams responsible for the overall architecture of the product (architecture), the software engineering of the product (engineering), the monitoring and production handling of the product (operations), the design and deployment of hardware for the product (infrastructure engineering), and the testing of the product (quality assurance).

The choice to define these within organizations was a tradeoff. We wanted to ensure that everyone had a list of scalability-related responsibilities that need to exist within any organization. This could be accomplished with a simple list of responsibilities that could be parceled out to any organizational structure. We also wanted a number of teams to be able to use the responsibilities out of the book immediately, which was best served by defining those responsibilities within the traditional organizational constructs. In no way do we mean to imply, however, that this is the only way to set up responsibilities for your organizations. You should develop the organizational structure that best serves your needs and ensure that all of the responsibilities included in the following sections are contained within one of your teams.

##### Architecture Responsibilities

The team responsible for architecture is responsible for ensuring that the design and architecture of the system allow for scale in the timeframe appropriate to the business. Here, we clearly indicate a difference between the intended design and the actual implementation. The team or teams responsible for architecture decisions need to think well in advance of the needs of the business and to have thought through how to scale the system long before the business unit owners forecast demand exceeding the platform capacity at any given time. For instance, the architecture team may have developed an extensible data access layer (DAL) or data access object (DAO) that can allow for multiple physical databases to be accessed with varying schemas as user demand increases in any given area. The actual implementation may be such that only a single database is used, but with some cost-effective modification of the DAL/DAO and some work creating migration scripts, additional databases can be stood up in the production environment in a matter of weeks rather than months should the need arise. The architecture team is further responsible for creating the set of architecture standards by which engineers design code and implement systems.

The architecture team, more than any other team, is responsible for designing a system and having designs ready to solve any scale related issue.
In Part II, Building Processes for Scale, we identify a key process that the architecture team should adopt to help identify scale related problems across all of the technology disciplines.

Architects may also be responsible for forming information technology (IT) governance, standards, and procedures, and for enforcing those standards through such processes as the Architecture Review Board discussed in Chapter 14, Architecture Review Board. When architects perform these roles, they do so at the request of the chief technology executive. Some larger companies may create process engineering teams responsible for procedure definition and standards enforcement.

##### Engineering Responsibilities

This team is “where the rubber meets the road.” The engineering team is the chief implementer of the scalability mission and the chief tuner of the product platform. Engineers take the architecture and create lower-level designs that they ultimately implement within code. They are responsible for adhering to the company’s architectural standards. Engineering teams are one of the two or three teams most likely to truly understand the limits of the system as implemented, given that they are one of the teams with the greatest daily involvement with that system. As such, they are key contributors to the process of identifying future scale issues.

##### Production Operations Responsibilities

The production operations team is responsible for running the hardware systems and software systems necessary to complete the mission of the company.
In the Software as a Service and Web 2.0 worlds, this is the team responsible for running and monitoring the systems that create the company’s revenue. In a classic information technology organization, such as those that might exist in a bank, these members are responsible for running the applications and systems that handle the bank’s daily transactions, and so on. In a company producing a manufactured product, such as a company in the automotive industry, this team is responsible for handling all of the company’s manufacturing systems, enterprise resource planning systems, and so on.

This team is one of the three teams with excellent insight into the limitations of the system as currently implemented. Because the team works with the running system every day and has daily insight into system utilization data, these team members are uniquely qualified to identify bottlenecks within the system.

Often, this team is responsible for creating utilization reports and daily downtime and activity reports, and is responsible for escalating issues and managing issues to resolution. As such, very often, capacity planning will fall onto this team, although that is not an absolute necessity. Operations personnel are also typically responsible for creating reports that show trended availability over time, bucketing root causes and corrective actions, and determining mean time to resolution and mean time to restoration for various problems.
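As a rough illustration of the reports described above, the following sketch computes availability, mean time to restoration, and downtime bucketed by root cause from a list of outage records. The `Outage` record, its field names, and the function names are hypothetical and chosen only for this example; a real operations team would draw this data from its own incident tracking system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Outage:
    """Hypothetical outage record; the field names are illustrative only."""
    started: datetime
    restored: datetime
    root_cause: str

    @property
    def duration(self) -> timedelta:
        return self.restored - self.started


def availability(outages, period):
    """Fraction of the period during which the service was up.

    Assumes outages do not overlap and fall entirely inside the period.
    """
    downtime = sum((o.duration for o in outages), timedelta())
    return 1.0 - downtime / period


def mean_time_to_restore(outages):
    """Average time from the start of an outage until service is restored."""
    if not outages:
        return timedelta()
    return sum((o.duration for o in outages), timedelta()) / len(outages)


def downtime_by_root_cause(outages):
    """Bucket total downtime by recorded root cause."""
    buckets = {}
    for o in outages:
        buckets[o.root_cause] = buckets.get(o.root_cause, timedelta()) + o.duration
    return buckets
```

Trending these figures by month, rather than looking at a single period, is what lets the team show whether availability is improving or degrading over time.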

Regardless of the composition of the team, the organization responsible for monitoring and reporting on the health of systems, applications, and quality of service plays a crucial role in helping to identify scale issues. The processes that this group employs to manage issue and problem resolution should feed information into other processes that help identify scale issues in advance of major outages. The data that the operations organization collects is incredibly valuable to those performing capacity planning as well as those responsible for designing away systemic and recurring issues such as scale related events. The architecture and engineering teams rely heavily on production operations to help them identify what should be fixed and when. We discuss some of these processes in Part II and more specifically in Chapter 8, Managing Incidents and Problems, Chapter 13, Joint Architecture Design, and Chapter 14, Architecture Review Board.

##### Infrastructure Responsibilities

This organization is typically composed of database administrators, network engineers, and systems administrators. They are often responsible for defining which systems will be used, when systems should be purchased, and when systems should be retired. This group is also one of the three groups interacting with the holistic system, platform, or product on a daily basis; as such, these members are uniquely qualified to help identify where bottlenecks exist.
Their primary responsibility is to identify capacity constraints on the systems, network devices, and databases that they support and to further help in identifying appropriate fixes for scale related issues.

##### Quality Assurance Responsibilities

In the ideal scenario, the team responsible for testing an application to ensure it is consistent with the company’s product or systems requirements will also play a role in advanced testing for scale. New products, features, and functionality change the demand characteristics of a system, platform, or product. Most often, we are adding new functions that by definition create additional demand on a system. Ideally, we can profile that new demand to ensure that the release of our new functionality or features won’t have a significant impact on the production environment. The QA organization also needs to be aware of all other changes going on around it so that it can ensure that whatever scale related testing is done gets updated in a timely fashion.

##### Capacity Planning Responsibilities

This organization or responsibility can reside nearly anywhere, but it needs access to up-to-date information regarding system, product, and platform performance. Capacity planning is a key to scaling efficiently and cost effectively.
When performed well, the capacity planning process results in the timely purchase of equipment where systems are easily horizontally scaled, the emergency purchase of larger equipment where systems cannot yet be scaled horizontally, and the identification of systems that should be prioritized high on the list of scale related problems to correct.

You may notice that we use the word emergency when describing the purchase of a larger system. Many companies take the approach that “scaling up” is an effective strategy, but our position, as we will describe in Chapters 21 through 25, is that if your scaling strategy relies on faster and bigger hardware, your solution does not scale; rather, you are relying upon the scalability of your providers to allow you to scale. Stating that you scale by moving to bigger and faster hardware is like stating that you are fast because you bought a bigger, faster car. You have not worked to become faster, and you are only as fast as anyone else with similar wealth. Scalability is the ability to scale independent of bigger and faster systems or the next release of an application server or database.

#### Individual Contributor Responsibilities and Characteristics

Having just described the scalability related roles that should be covered by different organizations within your company, we will now describe the roles of individuals that might fit within different organizations. We will cover the role of the architect, the software engineer, the operator, the infrastructure engineer, the QA analyst, and the capacity planner.
These roles may not need to be staffed by a single person or a group of people if you are a small company; it is enough in small companies to have the responsibilities defined within each of the roles assigned to different individuals within your organization.

##### Architect

More than any other role, the architect is responsible for the availability, scalability, and technical success of the product, platform, or system design. When it comes to scalability, the great architect will have an answer for how he or she expects to scale any given component of the system and be able to explain why his or her approach is the most cost-effective solution available for that component.

The architect must know the end user, have a holistic view of the system, understand the cost of operating the system in its current design and implementation, and have a deep knowledge of all technologies employed to create the system, platform, or product. Too often, architects will work out of “ivory towers” and not really know how the product, platform, or system “really” operates. They may get too much into “markitecture,” the creation of slides to impress others with their intelligence, and stray too far from the nuts and bolts of how things really work.

The architect needs to be an evangelist for the appropriate way to solve scale related issues. She needs to be aware of emerging technologies and how those might be employed to win the scalability battle.
Great architects understand and have a history with both the software and the systems that make up the production environment and facilitate the product, platform, or holistic system in question.

When it comes to scale initiatives, the architect should be measured by the true performance of the system. Has it had availability or performance related issues as a result of the architect’s design?

For truly hyper-growth companies, we suggest the creation of a specialized architect focused on platform, product, or system scalability. We believe that there is sufficient specificity in technical knowledge, perspective, and focus unique to scale initiatives that companies undergoing extreme growth need someone with a focus just on scaling the system. The ideal candidate for such a position should be able to explain how to split both systems and applications along the lines we discuss in Chapters 21 through 24. Furthermore, the architect ideally comes with a resume indicating how he has performed such splits in the past. We call this unique position a scalability architect.

##### Software Engineer

A software engineer is a member of the team responsible for implementing functionality and product changes and additions in software. The software engineer is also responsible for coding any proprietary changes that allow a system to be more highly scalable.

The software engineer, more than any other role, is responsible for the scalability of his portion of the system as it is implemented.
Here, we call out the difference between design and implementation, as very often an implementation will not be 100% consistent with the design. For instance, if a design calls for a configurable number (maximum number undefined) of similarly configured read databases, to which all read transactions can be evenly distributed, and the software engineer implements a system capable of handling up to five read databases, he or she has implemented a system with a defined scale limit. Here, the defined scale limit is the limitation the engineer put on how many databases can be used (five).

A software engineer should understand the portion of the system that she supports, maintains, or for which she creates code. She should also understand the end user and how the end user interacts with her portion of the system. The software engineer is a contributor to many of the scalability processes we define later in Part II.

##### Operator

The operator is responsible for handling the daily operations of the production system, whether that system is a Web 2.0 system or a back office IT system. She is responsible for monitoring the system against specific service levels, monitoring for out-of-bounds conditions, alerting individuals based on service level or boundary condition failures, and tracking incidents to closure. A form of operator, sometimes called an incident manager, is responsible for managing major problems to closure and issuing root cause and corrective action reports.
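The operator's monitoring duties can be pictured with a small sketch: compare the latest readings against agreed bounds and report who should be alerted. Everything here is an assumption made for illustration, including the `ServiceLevel` record, the metric names, and the idea that each service level names a single role to notify; it is not a description of any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    """Hypothetical service level: a metric must stay within [low, high]."""
    metric: str
    low: float
    high: float
    notify: str  # illustrative role name of whoever is alerted


def out_of_bounds(levels, readings):
    """Return (notify, message) pairs for readings that violate their bounds.

    A metric with no reading is also reported, on the assumption that
    missing monitoring data is itself a condition worth alerting on.
    """
    alerts = []
    for level in levels:
        value = readings.get(level.metric)
        if value is None:
            alerts.append((level.notify, f"{level.metric}: no reading"))
        elif not level.low <= value <= level.high:
            alerts.append(
                (level.notify,
                 f"{level.metric}={value} outside [{level.low}, {level.high}]")
            )
    return alerts
```

Tracking each alert through to closure, and handing major ones to an incident manager, would sit on top of a check like this one.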

##### Infrastructure Engineer

Infrastructure engineer is a generic term used to identify database administrators, network engineers, and systems administration professionals. The infrastructure engineer is responsible for the selection, configuration, implementation, tuning, and proper functioning of the devices or systems under his purview.

The infrastructure engineer, more than any other role, is responsible for the scalability of the systems that he supports. As such, a database analyst is responsible for identifying early when his database is going to fail based on capacity constraints and for identifying potential opportunities for scaling. A systems analyst is expected to do the same for her systems and storage, and a network engineer for the portions of the network that she supports.

In addition to having a deep technical understanding of his specific discipline, a skilled infrastructure engineer should understand the product he helps to support and be conversant in the “sister” disciplines within the hardware and systems community in order to aid in troubleshooting (a great systems administrator, for instance, should have a basic understanding of networks and a good understanding of how to troubleshoot basic database problems). He should also have a good knowledge of technologies that compete with those employed in his product or platform, and a good understanding of emerging technologies within his field.
The infrastructure engineer should also understand the cost of operating his system and the opportunities to reduce that cost over time. Finally, the best infrastructure engineers are agnostic to the technologies they employ, a point we will cover in Chapter 20, Designing for Any Technology.

##### QA Engineer/Analyst

The QA engineer or analyst is responsible for testing the application and the systems infrastructure to ensure that they meet the product specifications. A portion of her time should be dedicated to performance testing as it relates to scalability and as defined in Chapter 17, Performance and Stress Testing.

##### Capacity Planner

We’ve discussed the role and activity of the capacity planner in earlier sections of this chapter. Put simply, the capacity planner is responsible for matching the expected demand (typically generated by the business unit) to the current system as implemented to determine where additional changes need to be made in the system, platform, or product. The capacity planner is not responsible for defining what these changes are; rather, she outlines where changes need to occur.

In the case where a change needs to be made to a system that scales horizontally, the capacity planner may have as part of her job description the responsibility to help kick off the purchase order process to bring in new equipment. More often than not, the capacity planner is also a critical part of the process of budgeting for new systems and new initiatives to meet the demand forecasted by the business.
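The matching of expected demand to current capacity can be sketched in a few lines. This is only an illustration under stated assumptions: demand and capacity are expressed as requests per second keyed by function, the `headroom` reserve is a parameter invented for this example, and all names are hypothetical. In keeping with the role as described, the output says where a gap exists, not how to close it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityGap:
    """One function whose forecast demand exceeds its usable capacity."""
    function: str
    forecast: float
    usable_capacity: float

    @property
    def shortfall(self) -> float:
        return self.forecast - self.usable_capacity


def find_gaps(forecast, capacity, headroom=0.0):
    """Compare forecast demand to capacity for each function.

    forecast: expected requests/sec per function (from the business unit)
    capacity: maximum sustainable requests/sec per function, as implemented
    headroom: fraction of capacity held in reserve (an assumption of this sketch)

    Returns the functions where demand exceeds usable capacity,
    largest shortfall first. A function with no known capacity counts as zero.
    """
    gaps = []
    for function, demand in forecast.items():
        usable = capacity.get(function, 0.0) * (1.0 - headroom)
        if demand > usable:
            gaps.append(CapacityGap(function, demand, usable))
    return sorted(gaps, key=lambda gap: gap.shortfall, reverse=True)
```

The resulting list is the kind of input that would then go to architecture, engineering, or purchasing to decide on the actual change.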

#### An Organizational Example

The new CEO of AllScale analyzes her team over the first 90 days. The company has had a number of scalability related incidents with its flagship HRM product, and Christine determines that the current CTO (in AllScale’s case, the CTO is the highest technology management position in the company) simply isn’t capable of handling the development of new functionality and the stabilization of the existing platform. Christine believes that one of the issues with the executive previously in charge of technology was that he really had no business acumen and could not properly explain the need for certain purchases or projects in business terms. The former CTO simply did not understand simple business concepts like return on investment and discounted cash flow. Furthermore, he always expected the business folks to understand the need for what his business peers believed were his pet projects and would simply say, “We either do this or we will die.” Although the technology team’s budget was nearly 20% of the company’s $200 million in revenue, systems still failed, and the old CTO would blame unfunded projects for outages and then blame the business people for not understanding technology.

The CEO sits down with her new CTO, a person she picked from an array of candidates with graduate degrees in both business and electrical engineering or computer science, and explains that while she will delegate the responsibility for technical decisions to the CTO and empower him to make decisions within his budget
limitations, she will not and cannot delegate the accountability for his results. She explains that she wants to create a culture of scalability in the company along the lines of the old manufacturing motto, “everyone is accountable for quality.” She will work to add a nod toward scalability in the corporate vision and add a corporate belief surrounding the need to cost effectively scale to customer demands without quality of service or availability (a.k.a. downtime) problems.

The new CTO, Johnny Fixer, asks for 30 days to review the organization, identify and put in motion some quick win projects, and report back with a plan to make the technology platform, organization, and processes highly scalable and highly available. He promises to keep Christine informed and communicate the issues he finds and concerns he has. They agree to talk daily on the phone, exchange emails more often, and meet personally at least once a week.

Johnny quickly identifies overlaps in jobs in certain areas and responsibilities that are completely missing from his team. For instance, no one is responsible for developing a single cohesive capacity plan. Furthermore, teams do not collaborate on designs, architects are not engaged with the engineering teams and do not understand the current state of customer grief with the product, and quality defects are blamed on a QA team with no engineering ownership of bugs.

Johnny works quickly to hire a capacity planner onto his team.
As it is May and the company’s budgeting for the next year must be complete by October, he knows he must get good data about current system performance relative to peak theoretical capacity and start to get next year’s demand projections from the business to help the CFO create his next fiscal year budget. The newly hired capacity planner starts working with the engineering team to install the appropriate monitoring systems to collect system data in order to identify capacity bottlenecks, and she works with finance both to understand the current budget and to help provide information to generate the next year’s budget.

Although the CTO is worried about all of his technology problems, he knows that long term he is going to have to focus his teams on how they can work together and create shareholder value. He implements a tool for defining roles and responsibilities called RASCI, for Responsible, Accountable, Supportive, Consulted, and Informed (this tool is defined further in the next section), and implements Joint Architecture Design and the Architecture Review Board (defined in Chapters 13 and 14) to help resolve the lack of cooperation between organizations.

Johnny walks through the past 30 days of issues and identifies that the team is not keeping track of outages, incidents, and their associated impact on the business. He makes his head of technical operations responsible for all outage tracking and indicates that together they will review all issues daily and track them to closure.
He further requires that all architects attend at least one of the daily operations meetings per month to help get them closer to the customer and to better understand the pains associated with the current system. While meeting with his engineering managers, Johnny indicates that all bugs will be considered engineering and QA failures rather than just QA failures and that the company will begin tracking defects (or bugs) per feature produced with a goal of reducing all failures.

To help align his teams to the need for a more reliable and available site, Johnny implements a site uptime or availability metric and a goal to achieve greater than 99.99% availability by month within the next four months. With the CEO’s advice and permission, and with the help of his architects, engineers, and infrastructure engineers, he reprioritizes some projects to attack the site outage incidents that appear (given the small amount of data) to have caused the most grief to the company.

Johnny then implements a governance council for all engineering projects consisting of the CEO, the CFO, and all of the business unit leaders. The council is responsible for prioritizing projects, including availability projects, and for additionally measuring their returns against the promised success and business metrics upon which they were based.

After the first 30 days, Johnny covers his 30-, 60-, and 90-day forward plans with the CEO, and they jointly agree on a vision and set of goals for the engineering team (see Chapter 4, Leadership 101). Christine then has an “all hands” meeting with the entire company explaining that scalability and availability of the platform are of the utmost priority and that it is “everyone’s job” to help ensure that the company and its services scale to meet customer demands. To help incent the company toward an appropriate culture that includes the notion of being “highly scalable,” she insists that all managers have as part of their bonus compensation a scalability related goal that represents no less than 5% of their bonus. She delegates the development of those goals to her subordinates and asks to review them in the next 30 days.

#### A Tool for Defining Responsibilities

Many of our clients use a simple tool to help them define role clarity for any given initiative. Often when we are brought in to help with scalability in a company, we employ this tool to define who should do what, to help eliminate wasted work, and to ensure complete coverage of all scalability related needs. Although it is technically a process, as this is a chapter on roles and responsibility, we felt compelled to include this tool here.

The tool we most often use is called RASCI. It is a responsibility assignment chart and the acronym stands for Responsible, Accountable, Supportive, Consulted, and Informed.

* R stands for Responsible. This is the person responsible for completing the project or initiative.
* A stands for Accountable. This is the person to whom R is accountable and who must approve the work before it is okay to complete. The A is sometimes referred to as the approver of any initiative.
* S stands for Supportive. These people provide resources to complete the project or initiative.
* C stands for Consulted. These people have data or information that can be useful in completing the project.
* I stands for Informed. These people should be notified, but do not need to be consulted or provide input to the project.

RASCI can be used in a matrix, where each activity or initiative is spelled out along the y or vertical axis of the matrix and the individual contributors or organizations are spelled out on the x-axis of the matrix. The intersection of the activity (y-axis) and the organization (x-axis) contains one of the letters R, A, S, C, or I, or is left empty if that individual or organization is not part of the initiative.

Ideally, in any case, there will be a single R and a single A for any given initiative. This helps eliminate the issue we identified earlier in this chapter of having multiple organizations or individuals feeling that they are responsible for any given initiative. By having a single person or organization responsible, you are abiding by the “one back to pat and one throat to choke” rule. A gentler way of saying this is that distributed ownership is ownership by no one.
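The single-R, single-A guideline lends itself to a mechanical check. The sketch below is one possible way to express it, assuming a row of the matrix is held as a mapping from a person or organization to one RASCI letter; the function name and the wording of the messages are invented for this illustration.

```python
from collections import Counter

RASCI_LETTERS = {"R", "A", "S", "C", "I"}


def check_initiative(name, assignments):
    """Return a list of problems found in one row of a RASCI matrix.

    assignments maps a person or organization to a single RASCI letter.
    Parties that play no part in the initiative are simply left out.
    """
    problems = []
    unknown = {letter for letter in assignments.values()
               if letter not in RASCI_LETTERS}
    if unknown:
        problems.append(f"{name}: unknown letters {sorted(unknown)}")
    counts = Counter(assignments.values())
    for letter, label in (("R", "Responsible"), ("A", "Accountable")):
        if counts[letter] != 1:
            problems.append(
                f"{name}: expected exactly one {label}, found {counts[letter]}"
            )
    return problems
```

A row with two Rs and no A, for example, would produce two problems, which is exactly the distributed-ownership situation the guideline is meant to catch.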

This is not to say that others should not be allowed to provide input to the project or initiative. The RASCI model clearly allows and enforces the use of consultants or people within and outside your company who might add value to the initiative. An A should not sign off on an R’s approach until such time as the R has actually consulted with all of the appropriate people to develop the right course of action. And of course, if the company has the right culture, not only is the R going to want to seek those people’s help, but the R is going to make them feel that their input is valued and adds value to the decision making process.

You can add as many Cs, Ss, and Is as add value or are needed to complete any given project. That said, guard against going overboard regarding who exactly you will inform. Remember our discussion in the previous chapter about people being bogged down with email and communication that does not concern them. It is common in young companies to allow everyone to feel that they should be involved in every decision or informed of every decision. This information distribution mechanism simply does not scale and results in people reading emails rather than doing what they should be doing to create shareholder value.

A partially filled out example matrix is included in Table 2.1.

![Table 2.1: partially filled out RASCI matrix](https://blog.baidu-google.com/usr/uploads/2024/06/3906488006.png)

Taking some of our discussion thus far regarding different roles, let’s see how we’ve begun to fill out this RASCI matrix.

We earlier indicated that the CEO absolutely must be responsible for the culture of scalability, or the scale DNA of the company. Although it is theoretically possible for her to delegate this responsibility to someone else within the company, from a practical perspective, and as you will see in the chapter on leadership, she must live and walk the values associated with scaling the company and its platform. As such, even with delegation, and because we are talking about how the company “acts” with respect to scale, the CEO absolutely must “own” this. Therefore, we have placed an R in the CEO’s column next to the Scalability Culture initiative row. The CEO is obviously responsible to the board of directors and, as the creation of scale culture has to do with overall culture creation, we have indicated that the board of directors is the A.

Who are the Ss of the Scalability Culture initiative? Who should be informed and who needs to be consulted? In developing your answer to this question, you are allowed to have people who are Ss of any situation also be Cs in the development of the solution. It is implied that Cs and Ss will be informed as a result of their jobs, so it is generally not necessary to add an I for them wherever you need to communicate a decision or a result.

We’ve also completely filled out the row for Technical Scalability Vision.
Here, as we’ve previously indicated, the CTO is responsible for developing the vision for scalability for the product/platform/system. The CTO’s boss is very likely the CEO, so she will be responsible for approving the decision or course. Note that it is not absolutely necessary that the R’s boss be the A in any given decision. It is entirely possible that the R will be performing actions on behalf of someone for whom he or she does not work. In this case, however, assuming that the CTO works for the CEO, there is very little chance that the CTO would actually have someone other than the CEO approve his or her scalability vision or scalability plan.

Consultants to the scalability vision are the CTO’s peers, the people who rely on the CTO for either the availability of the product or the back office systems that run the company. These people need to be consulted because the systems that the CTO creates and runs are the lifeblood of the business units and the heart of the back office systems that the CFO needs to do his or her job.

We have indicated that the CTO’s organizations (Architecture group, Engineering team, Operations team, Infrastructure team, and QA) are all supporters of the vision, but one or more of them could also be consultants to the solution. The less technical the CTO, the more he will need to rely upon his teams to develop the vision for scalability. Here, we have assumed that the CTO has the greatest technical experience on the team, which is obviously not always the case. The CTO may also want to bring in outside help in determining the scalability vision and/or plan.
This outside help may be a retained advisory services firm or potentially the establishment of a technology advisory and governance board that provides for the technology team the same governance and oversight that a board of directors provides at a corporate level.

Finally, we have indicated that the board of directors needs to be Informed of the scalability vision. This might be a footnote in a board meeting or a discussion around what is possible with the current platform and how the company will need to invest to meet the scalability objectives for the coming years.

The remainder of the matrix has been partially filled out. An important point with respect to the matrix is that we have split up the tasks/initiatives to try to ensure that there aren’t any overlaps in the R category. For instance, the responsibility for infrastructure tasks has been split from the responsibility for software development or architecture and design tasks. This allows for clear responsibility in line with our “one back to pat and one throat to choke” philosophy. In so doing, however, the organization might tend to move toward designing in a silo or vacuum, which is counter to what you would like to have long term. Should you structure your organization in a similar fashion, it is very important that you implement processes that require teams to design together to create the best possible solution. Matrix-organized teams do not suffer from some of the silo mentality that exists within teams built in silos around functions or organizational responsibility, but they can still benefit from RASCI.
You should still have a single responsible organization, but you want to ensure that collaboration happens. RASCI helps enforce that through the use of the C attribute.

Please spend time working through the rest of the matrix in Table 2.1 to get comfortable with the RASCI model. It is a very effective tool for clearly defining roles and responsibilities and can help eliminate duplicated work, unfortunate morale-deflating fights, and missed work assignments.

#### Conclusion

Providing role clarity is the responsibility of leaders and managers. Individuals as well as organizations need role clarity. We provided some examples of how roles might be clearly defined to help in the organization’s mission of attaining higher availability. We also argued that these are but a few of the many examples that might be created regarding individuals and organizations and their roles. The real answer for you may vary significantly, as the roles should be developed consistent with company culture and need. In attempting to create role clarity, stay away from overlapping responsibilities, as these can create wasted effort and value-destroying conflicts. Also ensure that no areas are missing, as these will result in failures.

We also introduced a tool called RASCI to help define roles and responsibilities within the organization.
Feel free to use RASCI for your own organizational roles and for roles within initiatives. The use of RASCI can help eliminate duplicated work and make your organization more effective, efficient, and scalable.

##### Key Points

* Role clarity is critical for scale initiatives to be successful.
* Overlapping responsibility creates wasted effort and value-destroying conflicts.
* Areas missing responsibility create vacuums of activity and failed scale initiatives.
* The CEO is the chief scalability officer of the company.
* The CTO/CIO is the chief technical scale officer of the company.
* Key scale-related responsibilities for any organization include
  + Creation of the scalability vision for the organization
  + Setting measurable scale-related goals
  + Staffing the team with the appropriate skill sets necessary to meet the scalability objectives
  + Defining a scalable architecture
  + Implementing that architecture in systems and code
  + Testing the implementation against current and future user demand
  + Gathering data on current platform and product utilization to determine immediate needs for scale
  + Developing future demand projections and converting that demand projection into meaningful system demand
  + Analyzing the demand projections against the system to determine where changes are needed
  + Defining future changes based on the analysis
  + Developing processes to determine when and where systems will break and prioritizing fixes for those issues
* RASCI is a tool that can help eliminate overlaps in responsibility and
  create clear role definition. RASCI is developed in a matrix in which
  + R stands for the person Responsible for deciding what to do and running the initiative.
  + A is Accountable or the Approver of the initiative and the results.
  + S stands for Supportive, referring to anyone providing services to accomplish the initiative.
  + C stands for those who should be Consulted before making a decision and regarding the results of the initiative.
  + I stands for those who should be Informed of both the decision and the results.