Robotics Academic Digest [11.9]

Date: 2023-03-14 22:51:47

cs.RO (Robotics): 30 papers in total

【1】 A simple and versatile topology optimization formulation for flexure synthesis
Link: https://arxiv.org/abs/2111.04620

Authors: Stijn Koppen, Matthijs Langelaar, Fred van Keulen
Affiliation: Department of Precision and Microsystems Engineering, Delft University of Technology
Comments: For associated video of prototypes see this https URL
Abstract: High-tech equipment critically relies on flexures for precise manipulation and measurement. Through elastic deformation, flexures offer extreme position repeatability within a limited range of motion in their degrees of freedom, while constraining motion in the degrees of constraint. Topology optimization is a promising tool for the design of short-stroke flexures, providing maximum design freedom and allowing for application-specific requirements. State-of-the-art topology optimization formulations for flexure synthesis face challenges in ease of use, versatility, implementation complexity, and computational cost, so no generally accepted formulation exists. This study proposes a novel topology optimization formulation for the synthesis of short-stroke flexures based solely on strain energy measures under prescribed displacement scenarios. The resulting self-adjoint optimization problem closely resembles classic compliance minimization and inherits similar implementation simplicity, computational efficiency, and convergence properties. Numerical examples demonstrate the versatility in flexure types and the extensibility to additional design requirements. The provided source code encourages the formulation to be explored and applied in academia and industry.

【2】 CoCo Games: Graphical Game-Theoretic Swarm Control for Communication-Aware Coverage
Link: https://arxiv.org/abs/2111.04576

Authors: Malintha Fernando, Ransalu Senanayake, Martin Swany
Affiliation: Malintha Fernando and Martin Swany are with the Luddy School of Informatics, Computing, and Engineering at Indiana University; Ransalu Senanayake is with Stanford University
Comments: 8 pages, 7 figures
Abstract: We present a novel approach to maximize the communication-aware coverage for robots operating over large-scale geographical regions of interest (ROIs). Our approach complements the underlying network topology in neighborhood selection and control, rendering it highly robust in dynamic environments. We formulate the coverage as a multi-stage, cooperative graphical game and employ Variational Inference (VI) to reach the equilibrium. We experimentally validate our approach in a mobile ad-hoc wireless network scenario using Unmanned Aerial Vehicles (UAV) and User Equipment (UE) robots. We show that it can cater to ROIs defined by stationary and moving User Equipment (UE) robots under realistic network conditions.

【3】 Wrapped Haptic Display for Communicating Physical Robot Learning
Link: https://arxiv.org/abs/2111.04542

Authors: Antonio Alvarez Valdivia, Ritish Shailly, Naman Seth, Francesco Fuentes, Dylan P. Losey, Laura H. Blumenschein
Affiliation: School of Mechanical Engineering, Purdue University
Comments: 8 pages, 8 figures, 1 table
Abstract: Physical interaction between humans and robots can help robots learn to perform complex tasks. The robot arm gains information by observing how the human kinesthetically guides it throughout the task. While prior works focus on how the robot learns, it is equally important that this learning is transparent to the human teacher. Visual displays that show the robot's uncertainty can potentially communicate this information; however, we hypothesize that visual feedback mechanisms miss out on the physical connection between the human and robot. In this work we present a soft haptic display that wraps around and conforms to the surface of a robot arm, adding a haptic signal at an existing point of contact without significantly affecting the interaction. We demonstrate how soft actuation creates a salient haptic signal while still allowing flexibility in device mounting. Using a psychophysics experiment, we show that users can accurately distinguish inflation levels of the wrapped display with an average Weber fraction of 11.4%. When we place the wrapped display around the arm of a robotic manipulator, users are able to interpret and leverage the haptic signal in sample robot learning tasks, improving identification of areas where the robot needs more training and enabling the user to provide better demonstrations. See videos of our device and user studies here: https://youtu.be/tX-2Tqeb9Nw
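As a rough illustration of what the reported Weber fraction means: the Weber fraction is the just-noticeable difference divided by the reference stimulus intensity, so the smallest detectable change grows in proportion to the reference level. The Python sketch below applies that definition to the 11.4% figure from the abstract. The reference inflation level of 10.0 (in arbitrary units) and the function name are hypothetical and are not taken from the paper.

```python
def just_noticeable_difference(reference: float, weber_fraction: float) -> float:
    """Smallest detectable change at a given reference intensity (Weber's law)."""
    if reference < 0 or weber_fraction < 0:
        raise ValueError("reference and weber_fraction must be non-negative")
    return reference * weber_fraction


WEBER_FRACTION = 0.114   # average value reported in the abstract
reference_level = 10.0   # hypothetical inflation level, arbitrary units

jnd = just_noticeable_difference(reference_level, WEBER_FRACTION)
print(f"Detectable change at level {reference_level}: about {jnd:.2f} units")
```

Under this assumption, a user holding the display at level 10.0 would need a change of a little over one unit before reliably noticing it.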

【4】 D-Flow: A Real Time Spatial Temporal Model for Target Area Segmentation
Link: https://arxiv.org/abs/2111.04525

Authors: Wentao Lu, Claude Sammut
Affiliation: School of Computer Science and Engineering, The University of New South Wales
Abstract: Semantic segmentation has attracted a large amount of attention in recent years. In robotics, segmentation can be used to identify a region of interest, or target area. For example, in the RoboCup Standard Platform League (SPL), segmentation separates the soccer field from the background and from players on the field. For satellite or vehicle applications, it is often necessary to find certain regions such as roads, bodies of water or kinds of terrain. In this paper, we propose a novel approach to real-time target area segmentation based on a newly designed spatial temporal network. The method operates under domain constraints defined by both the robot's hardware and its operating environment. The proposed network is able to run in real time, working within the constraints of limited run time and computing power. This work is compared against other real-time segmentation methods on a dataset generated by a Nao V6 humanoid robot simulating the RoboCup SPL competition. In this case, the target area is defined as the artificial grass field. The method is also tested on a maritime dataset collected by a moving vessel, where the aim is to separate the ocean region from the rest of the image. This dataset demonstrates that the proposed model can generalise to a variety of vision problems.

【5】 Providing a Philosophical Critique and Guidance of Fairness Metrics
Link: https://arxiv.org/abs/2111.04417

Authors: Henry Cerbone
Comments: 5 pages, ERS 2021
Abstract: In this project, I seek to present a summarization and unpacking of themes of fairness both in the field of computer science and philosophy. This is motivated by an increased dependence on notions of fairness in computer science and the millennia of thought on the subject in the field of philosophy. It is my hope that this acts as a crash course in fairness philosophy for the everyday computer scientist and specifically roboticist. This paper will consider current state-of-the-art ideas in computer science, specifically algorithmic fairness, as well as attempt to lay out a rough set of guidelines for metric fairness. Throughout the discussion of philosophy, we will return to a thought experiment posed by Cynthia Dwork on the question of randomness.

【6】 GROWL: Group Detection With Link Prediction
Link: https://arxiv.org/abs/2111.04397

Authors: Viktor Schmuck, Oya Celiktutan
Affiliation: Centre for Robotics Research, Department of Engineering, King’s College London, London, United Kingdom
Abstract: Interaction group detection has been previously addressed with bottom-up approaches which relied on the position and orientation information of individuals. These approaches were primarily based on pairwise affinity matrices and were limited to static, third-person views. This problem can greatly benefit from a holistic approach based on Graph Neural Networks (GNNs) beyond pairwise relationships, due to the inherent spatial configuration that exists between individuals who form interaction groups. Our proposed method, GROup detection With Link prediction (GROWL), demonstrates the effectiveness of a GNN-based approach. GROWL predicts the link between two individuals by generating a feature embedding based on their neighbourhood in the graph and determines whether they are connected with a shallow binary classification method such as Multi-layer Perceptrons (MLPs). We test our method against other state-of-the-art group detection approaches on both a third-person view dataset and a robocentric (i.e., egocentric) dataset. In addition, we propose a multimodal approach based on RGB and depth data to calculate a representation GROWL can utilise as input. Our results show that a GNN-based approach can significantly improve accuracy across different camera views, i.e., third-person and egocentric views.

【7】 Rethinking Deconvolution for 2D Human Pose Estimation Light yet Accurate Model for Real-time Edge Computing
Link: https://arxiv.org/abs/2111.04226

Authors: Masayuki Yamazaki, Eigo Mori
Comments: IEEE International Conference on Automatic Face and Gesture Recognition 2021
Abstract: In this study, we present a pragmatic lightweight pose estimation model. Our model can achieve real-time predictions using low-power embedded devices. This system was found to be very accurate and achieved 94.5% of the accuracy of SOTA HRNet 256x192 using a computational cost of only 3.8% on the COCO test dataset. Our model adopts an encoder-decoder architecture and is carefully downsized to improve its efficiency. We especially focused on optimizing the deconvolution layers and observed that the channel reduction of the deconvolution layers contributes significantly to reducing computational resource consumption without degrading the accuracy of this system. We also incorporated recent model-agnostic techniques such as DarkPose and distillation training to maximize the efficiency of our model. Furthermore, we applied model quantization to exploit multi/mixed precision features. Our FP16'ed model (COCO AP 70.0) operates at ~60 fps on NVIDIA Jetson AGX Xavier and ~200 fps on NVIDIA Quadro RTX6000.

【8】 Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing UAVs: Field Experiments
Link: https://arxiv.org/abs/2111.04153

Authors: Eivind Bøhn, Erlend M. Coates, Dirk Reinhardt, Tor Arne Johansen
Affiliation: Department of Mathematics and Cybernetics, SINTEF DIGITAL, Oslo, Norway; Centre for Autonomous Marine Operations and Systems, Department of Engineering Cybernetics, NTNU, Trondheim, Norway
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Abstract: Attitude control of fixed-wing unmanned aerial vehicles (UAVs) is a difficult control problem in part due to uncertain nonlinear dynamics, actuator constraints, and coupled longitudinal and lateral motions. Current state-of-the-art autopilots are based on linear control and are thus limited in their effectiveness and performance. Deep reinforcement learning (DRL) is a machine learning method that automatically discovers optimal control laws through interaction with the controlled system and can handle complex nonlinear dynamics. We show in this paper that DRL can successfully learn to perform attitude control of a fixed-wing UAV operating directly on the original nonlinear dynamics, requiring as little as three minutes of flight data. We initially train our model in a simulation environment and then deploy the learned controller on the UAV in flight tests, demonstrating comparable performance to the state-of-the-art ArduPlane proportional-integral-derivative (PID) attitude controller with no further online learning required. To better understand the operation of the learned controller we present an analysis of its behaviour, including a comparison to the existing well-tuned PID controller.

【9】 PID Controller Optimization for Low-cost Line Follower Robots
Link: https://arxiv.org/abs/2111.04149

Authors: Samet Oguten, Bilal Kabas
Affiliation: Department of Electrical-Electronics Engineering, Abdullah Gül University, Kayseri, Turkey
Comments: 6 pages, 7 figures
Abstract: In this paper, modification of the classical PID controller and development of open-loop control mechanisms to improve stability and robustness of a differential wheeled robot are discussed. To deploy the algorithm, a test platform has been constructed using low-cost and off-the-shelf components including a microcontroller, reflectance sensor, and motor driver. This paper describes the heuristic approach used in the identification of the system specifications as well as the optimization of the controller. The PID controller is analyzed in detail and the effect of each term is explained in the context of stability. Lastly, the challenges encountered during the development of the controller and robot are discussed. Code is available at: https://github.com/sametoguten/STM32-Line-Follower-with-PID.
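For readers unfamiliar with how a PID loop steers a differential-drive line follower, a minimal textbook sketch in Python follows. It is not the authors' implementation (theirs is in the repository linked above) and does not include the modifications the paper proposes. The gains, the sign convention (positive error meaning the line lies to the robot's right), and the helper `wheel_speeds` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PID:
    """Classical discrete PID controller (textbook form)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the error derivative.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def wheel_speeds(base: float, correction: float, limit: float = 1.0) -> Tuple[float, float]:
    """Map a steering correction to (left, right) wheel commands, clamped to [-limit, limit]."""
    def clamp(x: float) -> float:
        return max(-limit, min(limit, x))
    # Assumed convention: a positive correction speeds up the left wheel,
    # which turns the robot to the right, toward the line.
    return clamp(base + correction), clamp(base - correction)


pid = PID(kp=2.0, ki=0.0, kd=0.0)           # hypothetical gains
correction = pid.update(error=0.1, dt=0.01)  # line offset from a reflectance sensor
left, right = wheel_speeds(base=0.5, correction=correction)
```

In this form the proportional term reacts to the current offset, the integral term removes steady drift, and the derivative term damps oscillation around the line.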

【10】 Optimization of the Model Predictive Control Meta-Parameters Through Reinforcement Learning
Link: https://arxiv.org/abs/2111.04146

Authors: Eivind Bøhn, Sebastien Gros, Signe Moe, Tor Arne Johansen
Affiliation: Artificial Intelligence and Data Analytics, SINTEF Digital, Oslo, Norway; Department of Engineering Cybernetics, NTNU, Trondheim, Norway; Sopra Steria Applications, Oslo, Norway; Centre for Autonomous Marine Operations and Systems, NTNU, Trondheim, Norway
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Abstract: Model predictive control (MPC) is increasingly being considered for control of fast systems and embedded applications. However, MPC poses some significant challenges for such systems. Its high computational complexity results in high power consumption from the control algorithm, which could account for a significant share of the energy resources in battery-powered embedded systems. The MPC parameters must be tuned, which is largely a trial-and-error process that affects the control performance, the robustness and the computational complexity of the controller to a high degree. In this paper, we propose a novel framework in which any parameter of the control algorithm can be jointly tuned using reinforcement learning (RL), with the goal of simultaneously optimizing the control performance and the power usage of the control algorithm. We propose the novel idea of optimizing the meta-parameters of MPC with RL, i.e. parameters affecting the structure of the MPC problem as opposed to the solution to a given problem. Our control algorithm is based on an event-triggered MPC where we learn when the MPC should be re-computed, and a dual mode MPC and linear state feedback control law applied in between MPC computations. We formulate a novel mixture-distribution policy and show that with joint optimization we achieve improvements that do not present themselves when optimizing the same parameters in isolation. We demonstrate our framework on the inverted pendulum control task, reducing the total computation time of the control system by 36% while also improving the control performance by 18.4% over the best-performing MPC baseline.

【11】 Automatic Goal Generation using Dynamical Distance Learning
Link: https://arxiv.org/abs/2111.04120

Authors: Bharat Prakash, Nicholas Waytowich, Tinoosh Mohsenin, Tim Oates
Affiliation: University of Maryland, Baltimore County, Baltimore, MD USA; US Army Research Lab, Aberdeen, MD USA
Abstract: Reinforcement Learning (RL) agents can learn to solve complex sequential decision making tasks by interacting with the environment. However, sample efficiency remains a major challenge. In the field of multi-goal RL, where agents are required to reach multiple goals to solve complex tasks, improving sample efficiency can be especially challenging. On the other hand, humans or other biological agents learn such tasks in a much more strategic way, following a curriculum where tasks are sampled with increasing difficulty level in order to make gradual and efficient learning progress. In this work, we propose a method for automatic goal generation using a dynamical distance function (DDF) in a self-supervised fashion. DDF is a function which predicts the dynamical distance between any two states within a Markov decision process (MDP). With this, we generate a curriculum of goals at the appropriate difficulty level to facilitate efficient learning throughout the training process. We evaluate this approach on several goal-conditioned robotic manipulation and navigation tasks, and show improvements in sample efficiency over a baseline method which only uses random goal sampling.

【12】 Ballistic Multibody Estimator for 2D Open Kinematic Chain
Link: https://arxiv.org/abs/2111.04118

Authors: Thanacha Choopojcharoen, Worachit Ketrungsri, Thanapong Chuangyanyong, Panusorn Chinsakuljaroen
Affiliation: Institute of Field Robotics, King Mongkut’s University of Technology Thonburi (Thanacha Choopojcharoen is currently an adjunct lecturer at the institute)
Comments: 7 pages, 4 tables, 5 figures
Abstract: Applications of free-flying robots range from entertainment purposes to aerospace applications. The control algorithm for such systems requires accurate estimation of their states based on sensor feedback. The objective of this paper is to design and verify a lightweight state estimation algorithm for a free-flying open kinematic chain that estimates the state of its center-of-mass and its posture. Instead of utilizing a nonlinear dynamics model, this research proposes a cascade structure of two Kalman filters (KF), which relies on the information from the ballistic motion of free-falling multibody systems together with feedback from an inertial measurement unit (IMU) and encoders. Multiple algorithms are verified in a Simulink simulation that mimics real-world circumstances. Several uncertain physical parameters are varied, and the results show that the proposed estimator outperforms EKF and UKF in terms of tracking performance and computational time.
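To make the idea of exploiting ballistic motion in a linear Kalman filter concrete, here is a minimal Python sketch for a single point mass in vertical free fall. It is a simplification under stated assumptions and not the paper's cascaded two-filter estimator: the state is only height and vertical velocity, the measurement is height alone, and the class name, noise values, and initial conditions are hypothetical.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration, m/s^2


@dataclass
class BallisticKF:
    """Linear Kalman filter for vertical free fall; state is (height z, velocity v)."""
    z: float
    v: float
    p00: float = 1.0  # entries of the 2x2 state covariance P
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1.0

    def predict(self, dt: float, q: float = 0.0) -> None:
        # Ballistic motion model: constant downward acceleration G.
        self.z += self.v * dt - 0.5 * G * dt * dt
        self.v -= G * dt
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = q * I.
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z_meas: float, r: float) -> None:
        # Height-only measurement: H = [1, 0], noise variance r.
        s = self.p00 + r
        k0, k1 = self.p00 / s, self.p10 / s
        innovation = z_meas - self.z
        self.z += k0 * innovation
        self.v += k1 * innovation
        # P <- (I - K H) P
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def run_demo(steps: int = 100, dt: float = 0.01):
    """Track a falling body starting from a deliberately wrong height estimate."""
    true_z, true_v = 10.0, 2.0
    kf = BallisticKF(z=11.0, v=2.0)
    for _ in range(steps):
        true_z += true_v * dt - 0.5 * G * dt * dt
        true_v -= G * dt
        kf.predict(dt)
        kf.update(true_z, r=0.01)  # noise-free measurement keeps the demo deterministic
    return abs(kf.z - true_z), abs(kf.v - true_v)


z_err, v_err = run_demo()
```

Because gravity enters as a known input, the model stays linear and no Jacobians are needed, which is the kind of saving a ballistic assumption offers over EKF or UKF formulations.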

【13】 Hierarchical Segment-based Optimization for SLAM
Link: https://arxiv.org/abs/2111.04101

Authors: Yuxin Tian, Yujie Wang, Ming Ouyang, Xuesong Shi
Comments: IROS 2021
Abstract: This paper presents a hierarchical segment-based optimization method for Simultaneous Localization and Mapping (SLAM) systems. First we propose a reliable trajectory segmentation method that can be used to increase efficiency in the back-end optimization. Then we propose a buffer mechanism for the first time to improve the robustness of the segmentation. During the optimization, we use global information to optimize the frames with large error, and interpolation instead of optimization to update well-estimated frames, so that computation is allocated hierarchically according to the error of each frame. Comparative experiments on the benchmark show that our method greatly improves the efficiency of optimization with almost no drop in accuracy, and outperforms existing high-efficiency optimization methods by a large margin.

【14】 Online Adaptation of Monocular Depth Prediction with Visual SLAM
Link: https://arxiv.org/abs/2111.04096

Authors: Shing Yan Loo, Moein Shakeri, Sai Hong Tang, Syamsiah Mashohor, Hong Zhang
Affiliation: Department of Computing Science, University of Alberta; Universiti Putra Malaysia
Comments: 9 pages, 8 figures
Abstract: The ability of a CNN to predict depth accurately is a major challenge for its wide use in practical visual SLAM applications, such as enhanced camera tracking and dense mapping. This paper sets out to answer the following question: Can we tune a depth prediction CNN with the help of a visual SLAM algorithm, even if the CNN is not trained for the current operating environment, in order to benefit the SLAM performance? To this end, we propose a novel online adaptation framework consisting of two complementary processes: a SLAM algorithm that is used to generate keyframes to fine-tune the depth prediction and another algorithm that uses the online adapted depth to improve map quality. Once the potentially noisy map points are removed, we perform global photometric bundle adjustment (BA) to improve the overall SLAM performance. Experimental results on both benchmark datasets and a real robot in our own experimental environments show that our proposed method improves the SLAM reconstruction accuracy. We demonstrate the use of regularization in the training loss as an effective means to prevent catastrophic forgetting. In addition, we compare our online adaptation framework against the state-of-the-art pre-trained depth prediction CNNs to show that our online adapted depth prediction CNN outperforms the depth prediction CNNs that have been trained on a large collection of datasets.

【15】 GSG: A Granary Soft Gripper with Mechanical Force Sensing via 3-Dimensional Snap-Through Structure
Link: https://arxiv.org/abs/2111.04046

Authors: Huixu Dong, Chao-Yu Chen, Chen Qiu, Chen-Hua Yeow, Haoyong Yu
Abstract: Grasping is an essential capability for most robots in practical applications. Soft robotic grippers are considered a critical part of robotic grasping and have attracted considerable attention for their high compliance and robustness to variance in object geometry; however, they are still limited by the corresponding sensing capabilities and actuation mechanisms. We propose a novel soft gripper that looks like a granary, with a compliant snap-through bistable mechanism fabricated by integrated mold technology, achieving sensing and actuation purely mechanically. In particular, the snap-through bistable structure in the proposed gripper allows us to reduce the complexity of the mechanism, control, and sensing designs, since the grasping and sensing behaviors are completely passive. The grasping behaviors are automatically triggered once the trigger position of the gripper touches an object and applies sufficient force. To grasp objects with various profiles, the proposed granary soft gripper (GSG) is designed to be capable of enveloping, pinching and caging grasps. The gripper consists of a chamber palm, a palm cap and three fingers. First, the design of the gripper is analyzed. Then, after the theoretical model is constructed, finite element (FE) simulations are conducted to verify the built model. Finally, a series of grasping experiments is carried out to assess the snap-through behavior of the proposed gripper on grasping and sensing. The experimental results illustrate that the proposed gripper can manipulate a variety of soft and rigid objects and remain stable even when subjected to external disturbances.

【16】 V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects
Link: https://arxiv.org/abs/2111.03987

Authors: Xingyu Liu, Kris M. Kitani
Affiliation: Robotics Institute, Carnegie Mellon University
Comments: CoRL 2021
Abstract: Manipulating articulated objects requires multiple robot arms in general. It is challenging to enable multiple robot arms to collaboratively complete manipulation tasks on articulated objects. In this paper, we present V-MAO, a framework for learning multi-arm manipulation of articulated objects. Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm. The training signal is obtained from interaction with the simulation environment which is enabled by planning and a novel formulation of object-centric control for articulated objects. We deploy our framework in a customized MuJoCo simulation environment and demonstrate that our framework achieves a high success rate on six different objects and two different robots. We also show that generative modeling can effectively learn the contact point distribution on articulated objects.

【17】 A Virtual Reality Simulation Pipeline for Online Mental Workload Modeling
Link: https://arxiv.org/abs/2111.03977

Authors: Robert L. Wilson, Daniel Browne, Jonathan Wagstaff, Steve McGuire
Affiliation: Department of Electrical and Computer Engineering
Comments: 7 pages, 4 figures, and 1 table. Currently under review as a conference paper for IEEE VR 2022
Abstract: Seamless human robot interaction (HRI) and cooperative human-robot (HR) teaming critically rely upon accurate and timely human mental workload (MW) models. Cognitive Load Theory (CLT) suggests representative physical environments produce representative mental processes; physical environment fidelity corresponds with improved modeling accuracy. Virtual Reality (VR) systems provide immersive environments capable of replicating complicated scenarios, particularly those associated with high-risk, high-stress scenarios. Passive biosignal modeling shows promise as a noninvasive method of MW modeling. However, VR systems rarely include multimodal psychophysiological feedback or capitalize on biosignal data for online MW modeling. Here, we develop a novel VR simulation pipeline, inspired by the NASA Multi-Attribute Task Battery II (MATB-II) task architecture, capable of synchronous collection of objective performance, subjective performance, and passive human biosignals in a simulated hazardous exploration environment. Our system design extracts and publishes biofeatures through the Robot Operating System (ROS), facilitating real-time psychophysiology-based MW model integration into complete end-to-end systems. A VR simulation pipeline capable of evaluating MWs online could be foundational for advancing HR systems and VR experiences by enabling these systems to adaptively alter their behaviors in response to operator MW.

【18】 Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods
Link: https://arxiv.org/abs/2111.03941

Authors: Seohong Park, Jaekyeom Kim, Gunhee Kim
Affiliation: Seoul National University
Abstract: In reinforcement learning, continuous time is often discretized by a time scale $\delta$, to which the resulting performance is known to be highly sensitive. In this work, we seek to find a $\delta$-invariant algorithm for policy gradient (PG) methods, which performs well regardless of the value of $\delta$. We first identify the underlying reasons that cause PG methods to fail as $\delta \to 0$, proving that the variance of the PG estimator can diverge to infinity in stochastic environments under a certain assumption of stochasticity. While durative actions or action repetition can be employed to have $\delta$-invariance, previous action repetition methods cannot immediately react to unexpected situations in stochastic environments. We thus propose a novel $\delta$-invariant method named Safe Action Repetition (SAR) applicable to any existing PG algorithm. SAR can handle the stochasticity of environments by adaptively reacting to changes in states during action repetition. We empirically show that our method is not only $\delta$-invariant but also robust to stochasticity, outperforming previous $\delta$-invariant approaches on eight MuJoCo environments with both deterministic and stochastic settings. Our code is available at https://vision.snu.ac.kr/projects/sar.
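One way to picture state-dependent action repetition is sketched below in Python: an action is held until the state has moved more than a chosen radius away from where the action was selected, so the decision interval is set by how far the state changes and not by a fixed number of steps. This is an illustrative reading of the abstract, not the authors' code (which is at the link above). The stopping rule, the one-dimensional toy dynamics, and all names are assumptions.

```python
from typing import Callable, Tuple


def repeat_action_while_safe(
    step: Callable[[float, float], float],
    state: float,
    action: float,
    radius: float,
    max_repeats: int = 10_000,
) -> Tuple[float, int]:
    """Hold `action` until the state leaves a ball of size `radius` around the start state."""
    start = state
    repeats = 0
    while repeats < max_repeats:
        state = step(state, action)
        repeats += 1
        if abs(state - start) > radius:
            break  # the state changed enough: hand control back to the policy
    return state, repeats


def elapsed_time(delta: float, radius: float = 0.55, action: float = 1.0) -> float:
    """Physical time for which one action is held in a toy integrator with step size delta."""
    def step(s: float, a: float) -> float:
        return s + a * delta  # hypothetical dynamics: ds/dt = a, Euler-discretized
    _, repeats = repeat_action_while_safe(step, state=0.0, action=action, radius=radius)
    return repeats * delta


coarse = elapsed_time(delta=0.1)
fine = elapsed_time(delta=0.01)
```

In this toy setting the held duration is roughly the same for both discretizations, whereas repeating an action a fixed number of steps would shrink the held duration tenfold when $\delta$ goes from 0.1 to 0.01.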

【19】 Feedback Control of Millimeter Scale Pivot Walkers Using Magnetic Actuation Link: https://arxiv.org/abs/2111.03934

Authors: Ehab Al Khatib, Pouria Razzaghi, Yildirim Hurmuzlu Affiliations: Department of Mechanical Engineering, Southern Methodist University Abstract: An external magnetic field can be used to remotely control small-scale robots, making them promising candidates for diverse biomedical and engineering applications. We showed that our magnetically actuated millirobot is highly agile and can perform a variety of locomotive tasks such as pivot walking and tumbling in a horizontal plane. Here, we focus on controlling the locomotion outcomes of this millirobot in the pivot walking mode. A mathematical model of the system is developed and the kinematic model is derived. The role of the sweep and tilt angles in the robot's motion is also investigated. We propose two controllers to regulate the gait of the pivot walker. The first one is a proportional-geometric controller, which determines the correct pivot point that the millirobot should use. Then, it regulates the angular velocity proportionally based on the error between the center of the millirobot and the reference trajectory. The second controller is based on a gradient descent optimization technique, which expresses the control action as an optimization problem. These control algorithms enable the millirobot to generate a stable gait while tracking the desired trajectory. We conduct a set of different experiments and simulation runs to establish the effectiveness of the proposed controllers for different sweep and tilt angles in terms of the tracking error. Both controllers perform adequately, but the gradient-descent-based controller yields faster convergence, smaller tracking error, and fewer steps. Finally, we perform an extensive experimental parametric analysis of the effect of the sweep angle, tilt angle, and step time on the tracking error. As expected, the optimization-based controller outperforms the geometry-based controller.

【20】 Swarm Control of Magnetically Actuated Millirobots Link: https://arxiv.org/abs/2111.03931

Authors: Pouria Razzaghi, Ehab Al Khatib, Yildirim Hurmuzlu Affiliations: Southern Methodist University Abstract: Small-size robots offer access to spaces that are inaccessible to larger ones. This type of access is crucial in applications such as drug delivery, environmental detection, and collection of small samples. However, some tasks cannot be performed using only one robot, including assembly and manufacturing at small scales, manipulation of micro- and nano-objects, and robot-based structuring of small-scale materials. The solution to this problem is to use a group of robots as a system. Thus, we focus on tasks that can be achieved using a group of small-scale robots. These robots are typically externally actuated due to their size limitation. Yet, one faces the challenge of controlling a group of robots using a single global input. We propose a control algorithm to position individual members of a swarm in predefined positions. A single control input applies to the system and moves all robots in the same direction. We also add another control modality by using robots of different lengths. An electromagnetic coil system applies external force and steers the millirobots. These millirobots can move in various modes of motion such as pivot walking and tumbling. We propose two new designs of these millirobots. In the first design, the magnets are placed at the center of the body to reduce the magnetic attraction force. In the second design, the millirobots are of identical length with two extra legs acting as the pivot points. This way we vary pivot separation in design to take advantage of variable speed in pivot walking mode while keeping the speed constant in tumbling mode. This paper presents a general algorithm for positional control of n millirobots with different lengths to move them from given initial positions to final desired ones. This method is based on choosing a leader that is fully controllable. Simulations and hardware experiments validate these results.
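Why a length-dependent mode combined with a length-independent mode allows independent positioning under one global input can be shown with a minimal one-dimensional Python sketch. It is not the paper's algorithm. It assumes, hypothetically, that a pivot-walking command of n steps moves each robot by n times its pivot separation, that a tumbling command moves every robot by the same distance, and that fractional or negative commands are allowed. All names are invented for illustration.

```python
from typing import Sequence, Tuple


def solve_two_robot_commands(
    separations: Tuple[float, float],
    displacements: Tuple[float, float],
) -> Tuple[float, float]:
    """Return (pivot_steps, tumble_distance) that realize both displacements.

    Idealized model (an assumption, not taken from the paper):
        displacement_i = separation_i * pivot_steps + tumble_distance
    """
    s1, s2 = separations
    d1, d2 = displacements
    if s1 == s2:
        # Identical robots respond identically to a global input, so their
        # displacements cannot be chosen independently.
        raise ValueError("robots need different pivot separations")
    pivot_steps = (d1 - d2) / (s1 - s2)
    tumble_distance = d1 - s1 * pivot_steps
    return pivot_steps, tumble_distance


def apply_commands(
    separations: Sequence[float],
    pivot_steps: float,
    tumble_distance: float,
) -> Tuple[float, ...]:
    """Displacement of every robot under the same two global commands."""
    return tuple(s * pivot_steps + tumble_distance for s in separations)
```

For example, with pivot separations `(2.0, 4.0)` and desired displacements `(3.0, 7.0)`, the solver returns two pivot-walking steps and a tumbling distance of -1.0 (that is, tumbling backward by one unit), and `apply_commands` reproduces the desired displacements. With equal separations the two equations coincide, which is the reason a second control modality is needed.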

【21】 Robust Deep Reinforcement Learning for Quadcopter Control Link: https://arxiv.org/abs/2111.03915

Authors: Aditya M. Deshpande, Ali A. Minai, Manish Kumar Affiliations: University of Cincinnati, Clifton Ave., Cincinnati, Ohio Notes: 6 pages; 3 Figures; Accepted in this https URL Abstract: Deep reinforcement learning (RL) has made it possible to solve complex robotics problems using neural networks as function approximators. However, policies trained on stationary environments generalize poorly when transferred from one environment to another. In this work, we use Robust Markov Decision Processes (RMDP) to train the drone control policy, which combines ideas from Robust Control and RL. It opts for pessimistic optimization to handle potential gaps in policy transfer from one environment to another. The trained control policy is tested on the task of quadcopter positional control. RL agents were trained in a MuJoCo simulator. During testing, different environment parameters (unseen during the training) were used to validate the robustness of the trained policy for transfer from one environment to another. The robust policy outperformed the standard agents in these environments, suggesting that the added robustness increases generality and can adapt to non-stationary environments. Code: https://github.com/adipandas/gym_multirotor

【22】 Flying Trapeze Act Motion Planning Algorithm for Two-Link Free-Flying Acrobatic Robot Link: https://arxiv.org/abs/2111.03823

Authors: Thanapong Chuangyanyong, Panusorn Chinsakuljaroen, Worachit Ketrungsri, Thanacha Choopojcharoen Affiliations: Institute of Field Robotics, King Mongkut's University of Technology Thonburi Notes: 6 pages, 8 figures Abstract: A flying trapeze act can be a challenging task for a robotic system since some acts require the performer to catch another trapeze or catcher at the end after being airborne. The objective of this paper is to design and validate a motion planning algorithm for a two-link free-flying acrobatic robot that can accurately land on another trapeze after free-flying in the air. First, the proposed algorithm plans the robot trajectory with a non-linear constrained optimization method. Then, a feedback controller is implemented to stabilize the posture. However, since the spatial position of the center of mass of the robot cannot be controlled, this paper proposes a trajectory correction scheme that manipulates the robot's posture such that the robot is still able to land on the target. Lastly, the whole algorithm is validated in a simulation that mimics real-world circumstances.

【23】 Prediction of Pedestrian Spatiotemporal Risk Levels for Intelligent Vehicles: A Data-driven Approach Link: https://arxiv.org/abs/2111.03822

Authors: Zheyu Zhang, Boyang Wang, Chao Lu, Jinghang Li, Cheng Gong, Jianwei Gong Affiliations: School of Mechanical Engineering, Beijing Institute of Technology, Beijing, China Abstract: In recent years, road safety has attracted significant attention from researchers and practitioners in the intelligent transport systems domain. As one of the most common and vulnerable groups of road users, pedestrians cause great concern due to their unpredictable behavior and movement, as subtle misunderstandings in vehicle-pedestrian interaction can easily lead to risky situations or collisions. Existing methods use either predefined collision-based models or human-labeling approaches to estimate the pedestrians' risks. These approaches are usually limited by their poor generalization ability and lack of consideration of interactions between the ego vehicle and a pedestrian. This work tackles the listed problems by proposing a Pedestrian Risk Level Prediction (PRLP) system. The system consists of three modules. Firstly, vehicle-perspective pedestrian data are collected. Since the data contain information regarding the movement of both the ego vehicle and pedestrian, they can simplify the prediction of spatiotemporal features in an interaction-aware fashion. Using the long short-term memory model, the pedestrian trajectory prediction module predicts their spatiotemporal features in the subsequent five frames. As the predicted trajectory follows certain interaction and risk patterns, a hybrid clustering and classification method is adopted to explore the risk patterns in the spatiotemporal features and train a risk level classifier using the learned patterns. Upon predicting the spatiotemporal features of pedestrians and identifying the corresponding risk level, the risk patterns between the ego vehicle and pedestrians are determined. Experimental results verified the capability of the PRLP system to predict the risk level of pedestrians, thus supporting the collision risk assessment of intelligent vehicles and providing safety warnings to both vehicles and pedestrians.

【24】 ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking Link: https://arxiv.org/abs/2111.03821

Authors: Nicola A. Piga, Yuriy Onyshchuk, Giulia Pasquale, Ugo Pattacini, Lorenzo Natale Affiliations: Nicola A. Piga is also with Università di Genova Abstract: 6D object pose tracking has been extensively studied in the robotics and computer vision communities. The most promising solutions, leveraging deep neural networks and/or filtering and optimization, exhibit notable performance on standard benchmarks. However, to the best of our knowledge, these have not been tested thoroughly against fast object motions. Tracking performance in this scenario degrades significantly, especially for methods that do not achieve real-time performance and introduce non-negligible delays. In this work, we introduce ROFT, a Kalman filtering approach for 6D object pose and velocity tracking from a stream of RGB-D images. By leveraging real-time optical flow, ROFT synchronizes delayed outputs of low frame rate Convolutional Neural Networks for instance segmentation and 6D object pose estimation with the RGB-D input stream to achieve fast and precise 6D object pose and velocity tracking. We test our method on a newly introduced photorealistic dataset, Fast-YCB, which comprises fast moving objects from the YCB model set, and on the dataset for object and hand pose estimation HO-3D. Results demonstrate that our approach outperforms state-of-the-art methods for 6D object pose tracking, while also providing 6D object velocity tracking. A video showing the experiments is provided as supplementary material.
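As background on how a Kalman filter can estimate velocity alongside position, the following is a minimal one-dimensional constant-velocity sketch in plain Python. It is a generic textbook filter, not ROFT: it tracks a scalar position rather than a 6D pose, it ignores optical flow and measurement delay, and the noise parameters and function name are assumptions chosen for illustration.

```python
from typing import Iterable, List, Tuple


def track_position_velocity(
    measurements: Iterable[float],
    dt: float,
    meas_var: float = 0.01,
    vel_process_var: float = 0.01,
) -> List[Tuple[float, float]]:
    """Constant-velocity Kalman filter with position-only measurements.

    State is (position, velocity); P is its 2x2 covariance.
    Returns the (position, velocity) estimate after every measurement.
    """
    p, v = 0.0, 0.0
    P = [[1.0, 0.0], [0.0, 1.0]]
    estimates: List[Tuple[float, float]] = []
    for z in measurements:
        # Predict: x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]]
        # and Q adding noise to the velocity component only.
        p = p + v * dt
        a, b = P[0]
        c, d = P[1]
        P = [
            [a + dt * (b + c) + dt * dt * d, b + dt * d],
            [c + dt * d, d + vel_process_var],
        ]
        # Update with a position measurement z (H = [1, 0]).
        s = P[0][0] + meas_var  # innovation variance
        k0 = P[0][0] / s  # Kalman gain for position
        k1 = P[1][0] / s  # Kalman gain for velocity
        innovation = z - p
        p += k0 * innovation
        v += k1 * innovation
        P = [
            [(1.0 - k0) * P[0][0], (1.0 - k0) * P[0][1]],
            [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]],
        ]
        estimates.append((p, v))
    return estimates
```

Feeding the filter noiseless positions of an object moving at a constant 2.0 units per second lets the velocity estimate settle close to 2.0 even though velocity is never measured directly. A tracker along the lines described in the abstract would additionally have to handle measurements that arrive late and at a lower rate than the image stream.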

【25】 Roofline Model for UAVs: A Bottleneck Analysis Tool for Designing Compute Systems for Autonomous Drones Link: https://arxiv.org/abs/2111.03792

Authors: Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Aleksandra Faust, Vijay Janapa Reddi Affiliations: Harvard University, LLNL, Google Brain Research Abstract: We present a bottleneck analysis tool for designing compute systems for autonomous Unmanned Aerial Vehicles (UAV). The tool provides insights by exploiting the fundamental relationships between various components in the autonomous UAV such as sensor, compute, and body dynamics. To guarantee safe operation while maximizing the performance (e.g., velocity) of the UAV, the compute, sensor, and other mechanical properties must be carefully designed (or selected). The goal of our proposed tool is to provide a visual model which aids system architects to understand optimal compute design (or selection) for autonomous UAVs. The tool is available at https://bit.ly/skyline-tool
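The relationship between sensor rate, compute rate, and safe velocity can be sketched with a simple kinematic model in Python. This is an illustrative assumption and not necessarily the model used by the tool: the decision rate is taken to be limited by the slower of sensor and compute, and the UAV is considered safe if the distance flown during one decision period plus the braking distance fits within the sensing range. All function and parameter names are hypothetical.

```python
import math


def action_throughput_hz(sensor_hz: float, compute_hz: float) -> float:
    """Decisions per second, limited by the slower pipeline stage."""
    return min(sensor_hz, compute_hz)


def bottleneck(sensor_hz: float, compute_hz: float) -> str:
    """Name the stage that limits the decision rate."""
    return "sensor" if sensor_hz < compute_hz else "compute"


def max_safe_velocity(
    sensing_range_m: float, max_decel_mps2: float, latency_s: float
) -> float:
    """Largest v satisfying v * T + v**2 / (2 * a) <= d.

    v * T is the distance covered before the UAV can react and
    v**2 / (2 * a) is the braking distance. Solving the quadratic gives
    v = a * (sqrt(T**2 + 2 * d / a) - T).
    """
    a, d, t = max_decel_mps2, sensing_range_m, latency_s
    return a * (math.sqrt(t * t + 2.0 * d / a) - t)


def safe_velocity_for_design(
    sensor_hz: float,
    compute_hz: float,
    sensing_range_m: float,
    max_decel_mps2: float,
) -> float:
    """Safe velocity when the reaction latency is one decision period."""
    latency_s = 1.0 / action_throughput_hz(sensor_hz, compute_hz)
    return max_safe_velocity(sensing_range_m, max_decel_mps2, latency_s)
```

Under these assumptions, raising the compute rate increases the safe velocity only until the sensor rate becomes the limiting stage, after which the curve flattens. That plateau is what gives a roofline-style plot its shape and indicates where additional compute no longer pays off.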

【26】 Asynchronous Collaborative Localization by Integrating Spatiotemporal Graph Learning with Model-Based Estimation Link: https://arxiv.org/abs/2111.03751

Authors: Peng Gao, Brian Reily, Rui Guo, Hongsheng Lu, Qingzhao Zhu, Hao Zhang Abstract: Collaborative localization is an essential capability for a team of robots such as connected vehicles to collaboratively estimate object locations from multiple perspectives with reliable cooperation. To enable collaborative localization, four key challenges must be addressed, including modeling complex relationships between observed objects, fusing observations from an arbitrary number of collaborating robots, quantifying localization uncertainty, and addressing latency of robot communications. In this paper, we introduce a novel approach that integrates uncertainty-aware spatiotemporal graph learning and model-based state estimation for a team of robots to collaboratively localize objects. Specifically, we introduce a new uncertainty-aware graph learning model that learns spatiotemporal graphs to represent historical motions of the objects observed by each robot over time and provides uncertainties in object localization. Moreover, we propose a novel method for integrated learning and model-based state estimation, which fuses asynchronous observations obtained from an arbitrary number of robots for collaborative localization. We evaluate our approach in two collaborative object localization scenarios in simulations and on real robots. Experimental results show that our approach outperforms previous methods and achieves state-of-the-art performance on asynchronous collaborative localization.

【27】 Quadrupedal Robotic Guide Dog with Vocal Human-Robot Interaction Link: https://arxiv.org/abs/2111.03718

Authors: Kavan Mehrizi, Zhongyu Li, Koushil Sreenath Affiliations: Department of Computer Science, Diablo Valley College, Pleasant Hill, CA, USA; Department of Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA Notes: Hopper Dean & NSF REU: Transfer-to-Excellence Research Experiences for Undergraduates (TTE REU), University of California, Berkeley Abstract: Guide dogs play a critical role in the lives of many; however, training them is a time- and labor-intensive process. We are developing a method to allow an autonomous robot to physically guide humans using direct human-robot communication. The proposed algorithm will be deployed on a Unitree A1 quadrupedal robot and will autonomously navigate the person to their destination while communicating with the person using a speech interface compatible with the robot. This speech interface utilizes cloud-based services such as Amazon Polly and Google Cloud to serve as the text-to-speech and speech-to-text engines.

【28】 Using Monocular Vision and Human Body Priors for AUVs to Autonomously Approach Divers Link: https://arxiv.org/abs/2111.03712

Authors: Michael Fulton, Jungseok Hong, Junaed Sattar Notes: 14 pages, under review for ICRA22-RAL Abstract: Direct communication between humans and autonomous underwater vehicles (AUVs) is a relatively underexplored area in human-robot interaction (HRI) research, although many tasks (e.g., surveillance, inspection, and search-and-rescue) require close diver-robot collaboration. Many core functionalities in this domain are in need of further study to improve robotic capabilities for ease of interaction. One of these is the challenge of autonomous robots approaching and positioning themselves relative to divers to initiate and facilitate interactions. Suboptimal AUV positioning can lead to poor quality interaction and to excessive cognitive and physical load for divers. In this paper, we introduce a novel method for AUVs to autonomously navigate and achieve diver-relative positioning to begin interaction. Our method is based only on monocular vision, requires no global localization, and is computationally efficient. We present our algorithm along with an implementation of said algorithm on board both a simulated and physical AUV, performing extensive evaluations in the form of closed-water tests in a controlled pool. Analysis of our results shows that the proposed monocular vision-based algorithm performs reliably and efficiently while operating entirely on board the AUV.

【29】 Towards Learning Generalizable Driving Policies from Restricted Latent Representations Link: https://arxiv.org/abs/2111.03688

Authors: Behrad Toghi, Rodolfo Valiente, Ramtin Pedarsani, Yaser P. Fallah Affiliations: University of Central Florida; Ramtin Pedarsani is with the Department of Electrical and Computer Engineering Notes: Submitted to IEEE Transactions on Robotics Abstract: Training intelligent agents that can drive autonomously in various urban and highway scenarios has been a hot topic in the robotics society within the last decades. However, the diversity of driving environments in terms of road topology and positioning of the neighboring vehicles makes this problem very challenging. It goes without saying that although scenario-specific driving policies for autonomous driving are promising and can improve transportation safety and efficiency, they are clearly not a universal scalable solution. Instead, we seek decision-making schemes and driving policies that can generalize to novel and unseen environments. In this work, we capitalize on the key idea that human drivers learn abstract representations of their surroundings that are fairly similar among various driving scenarios and environments. Through these representations, human drivers are able to quickly adapt to novel environments and drive in unseen conditions. Formally, through imposing an information bottleneck, we extract a latent representation that minimizes the \textit{distance} -- a quantification that we introduce to gauge the similarity among different driving configurations -- between driving scenarios. This latent space is then employed as the input to a Q-learning module to learn generalizable driving policies. Our experiments revealed that using this latent representation can reduce the number of crashes to about half.

【30】 Holodeck: Immersive 3D Displays Using Swarms of Flying Light Specks Link: https://arxiv.org/abs/2111.03657

Authors: Shahram Ghandeharizadeh Affiliations: University of Southern California, Los Angeles, California, USA Notes: A shorter version of this paper appeared in ACM Multimedia Asia (Gold Coast, Australia). this https URL Abstract: Unmanned Aerial Vehicles (UAVs) have moved beyond a platform for hobbyists to enable environmental monitoring, journalism, film industry, search and rescue, package delivery, and entertainment. This paper describes 3D displays using swarms of flying light specks, FLSs. An FLS is a small (hundreds of micrometers in size) UAV with one or more light sources to generate different colors and textures with adjustable brightness. A synchronized swarm of FLSs renders an illumination in a pre-specified 3D volume, an FLS display. An FLS display provides true depth, enabling a user to perceive a scene more completely by analyzing its illumination from different angles. An FLS display may either be non-immersive or immersive. Both will support 3D acoustics. Non-immersive FLS displays may be the size of a 1980s computer monitor, enabling a surgical team to observe and control micro robots performing heart surgery inside a patient's body. Immersive FLS displays may be the size of a room, enabling users to interact with objects, e.g., a rock, a teapot. An object with behavior will be constructed using FLS-matters. FLS-matter will enable a user to touch and manipulate an object, e.g., a user may pick up a teapot or throw a rock. An immersive and interactive FLS display will approximate Star Trek's Holodeck. A successful realization of the research ideas presented in this paper will provide fundamental insights into implementing a Holodeck using swarms of FLSs. A Holodeck will transform the future of human communication and perception, and how we interact with information and data. It will revolutionize the future of how we work, learn, play and entertain, receive medical care, and socialize.